Explainable AI in Life Sciences

While AI offers significant potential in life sciences, its implementation comes with several challenges, ranging from the sheer size of medical databases, to mandatory regulatory compliance, to the ethics of using black-box models in medical decision making.

Rulex Platform’s eXplainable AI has a profound impact on the implementation of AI in this sensitive sector by producing transparent, human-understandable results. This transparency enables medical experts to understand and explain any predictions made, while guaranteeing ethical data models and results, and adherence to privacy regulations. Such interpretability is essential for gaining trust and understanding the rationale behind medical decisions, and enables a healthy balance in human-AI collaboration.

Rulex Platform can also easily gather, aggregate, and analyze extremely large datasets in any format, and from any source, while integrating with underlying information systems, such as electronic health records or laboratory information management systems, without causing disruption and upheaval. Results can be produced in any format required, whether that is an email with urgent results, a tailored spreadsheet saved on a common server, or an interactive dashboard to show colleagues.

Thanks to its inherent explainability and agility in data management, Rulex Platform has been chosen by healthcare and life sciences organizations to leverage medical records, resulting in improved health outcomes, enhanced clinical and operational decision-making, and pioneering research.

1. Improving Data Quality in Hospital Discharge Reports

Healthcare quality-control systems in Italy are overseen by regional and local health authorities, which actively monitor and regulate the quality of healthcare services to ensure their appropriateness. Over time, numerous Italian regions have developed and revised guidelines and operational procedures aimed at scrutinizing hospital discharge reports and medical records.

The significance of accuracy in medical records cannot be overstated, as errors can lead to various repercussions, ranging from minor billing discrepancies to critical issues such as incomplete or incorrect diagnoses, or delays in scheduling surgical interventions.

In collaboration with Deimos, Rulex leveraged its eXplainable Artificial Intelligence (XAI) technology to automate the scrutiny of coding in hospital discharge forms within the Alto Adige health authority. The primary focus of the study was to assess the feasibility of applying automatic checks, characterized as logical clinical checks, not only to ensure sex-diagnosis or age-diagnosis compatibility, as traditionally done with formal logical checks, but also to explore the intricate relationships between clinical variables in hospital discharge reports. This approach aimed to automatically identify inconsistencies among diagnoses, surgery, medical procedures, and Diagnosis-Related Groups (DRGs).

The tested methodologies yielded promising results. Validation rules were defined, improving the efficiency of automatic record checks and the identification of the probable location of errors; the personnel time required for record checking was significantly reduced; and automatic checks were carried out on all surgical hospital discharge records, not just a test subset. Overall, the innovative approach not only enhanced the precision of existing checks but also introduced a more comprehensive and nuanced evaluation of the relationships within medical records.

2. Tailoring Diagnostic Predictions for Primary Biliary Cholangitis

Precision medicine seeks to customize the diagnosis, monitoring, and management of individuals based on their unique genetic and environmental backgrounds. This undertaking is particularly challenging due to the intricate nature of medical traits and the presence of multiple variants. The complexity is further amplified when addressing rare diseases, where limited historical data poses an additional hurdle.

In collaboration with the medical departments of Milano-Bicocca and Humanitas universities, Rulex conducted a pioneering study to assess the feasibility and precision of predicting the risk of Primary Biliary Cholangitis (PBC) using eXplainable AI (XAI). The focus was on identifying novel patient subgroups, disease sub-phenotyping, and risk stratification.

The XAI algorithm was applied to an extensive international dataset of PBC patients, divided into a training set of 11,819 subjects and a validation set of 1,069 subjects, with a meticulous analysis of key clinical features. The primary outcome was a composite of liver-related death or liver transplantation, assessed through a combination of machine learning and standard survival analysis.

The analysis revealed four distinct patient clusters, each characterized by unique phenotypes and long-term prognoses. These findings represented a pivotal milestone in formulating a targeted treatment approach for PBC. Additionally, they laid the foundation for ongoing efforts in identifying and providing timely treatment for the relatives of patients, confirming the potential of XAI in advancing precision medicine for complex diseases.

Related research paper:

  • Alessio Gerussi, Damiano Verda, Davide Paolo Bernasconi, Marco Carbone, Atsumasa Komori, Masanori Abe, Mie Inao, Tadashi Namisaki, Satoshi Mochida, Hitoshi Yoshiji, Gideon Hirschfield, Keith Lindor, Albert Pares, Christophe Corpechot, Nora Cazzagon, Annarosa Floreani, Marco Marzioni, Domenico Alvaro, Umberto Vespasiani-Gentilucci, Laura Cristoferi, Maria Grazia Valsecchi, Marco Muselli, Bettina E. Hansen, Atsushi Tanaka, Pietro Invernizzi, “Machine learning in primary biliary cholangitis: A novel approach for risk stratification”, Wiley, Dec 2021.

3. Identifying Correlations with XAI to Improve Metabolic Control in Type 2 Diabetes

One of the primary goals of diabetologists is to establish effective metabolic control in type 2 diabetes patients, measured through blood levels of HbA1c, without causing weight gain.

The Italian diabetology association used Rulex’s proprietary XAI to extract and rank the factors most strongly associated with reducing HbA1c levels. The study involved vast amounts of raw data, including the medical records of 2 million diabetic patients and data collected from medical visits over a 10-year period, with over 137 variables per patient.

Significant correlations were identified, such as the use of specific receptor agonists, and it was established that HbA1c and weight gain have different determinants. These results led to more efficient care for diabetic patients.

4. Extracting Rules to Diagnose Pleural Mesothelioma

Malignant pleural mesothelioma (MPM) is a rare and highly lethal tumor, with its incidence rising rapidly in developed countries due to past asbestos exposure in various environments. Accurate diagnosis of MPM faces challenges, as atypical clinical symptoms often lead to potential misdiagnoses with other malignancies (especially adenocarcinomas) or benign inflammatory or infectious diseases (BD) causing pleurisies. While cytological examination (CE) can identify malignant cells, a notable false negative rate may occur due to the prevalence of non-neoplastic cells. Additionally, a positive CE result alone may not distinguish MPM from other malignancies.

Various tumor markers (TM) have proven to be valuable complementary tools for MPM diagnosis. Recent studies focused on three tumor markers in pleural effusions: soluble mesothelin-related peptide (SMRP), CYFRA 21-1, and CEA. Their concentrations were analyzed in association with the differential diagnosis of MPM, pleural metastasis from other tumors (MTX), and BD. SMRP demonstrated the best performance in distinguishing MPM from both MTX and BD, while high CYFRA 21-1 values were linked to both MPM and MTX. Conversely, elevated CEA concentrations were primarily observed in patients with MTX. Combining information from the three markers and CE could form a classifier to separate MPM from both MTX and BD.

In this context, the Rulex Logic Learning Machine (LLM) was employed for the differential diagnosis of MPM by identifying straightforward and understandable rules based on CE and TM concentrations. Comparative analyses with other supervised methods, including Decision Trees, K-Nearest Neighbors, and Artificial Neural Networks, revealed that LLM consistently outperformed all competing approaches.
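As a purely illustrative sketch of what such intelligible rules can look like, consider the toy classifier below. The thresholds and branching are invented for the example; they are not taken from the study or produced by the LLM.

```python
# Illustrative only: hypothetical rules in the style of an intelligible
# rule-based classifier. Thresholds are invented, not from the study.

def classify_effusion(ce_positive: bool, smrp: float, cyfra: float, cea: float) -> str:
    """Toy differential diagnosis: MPM vs metastasis (MTX) vs benign (BD)."""
    if smrp > 8.0 and ce_positive:   # high mesothelin plus malignant cells
        return "MPM"
    if cea > 5.0:                    # elevated CEA points to metastasis
        return "MTX"
    if cyfra > 50.0:                 # high CYFRA 21-1 suggests malignancy
        return "MPM" if smrp > 2.0 else "MTX"
    return "BD"

print(classify_effusion(ce_positive=True, smrp=10.2, cyfra=60.0, cea=1.1))  # MPM
```

The value of this form is that a clinician can read, audit, and challenge every branch, which is precisely what black-box models cannot offer.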

5. Extracting a Simplified Gene Expression Signature for Neuroblastoma Prognosis

The outcome of cancer patients is, in part, influenced by the gene expression profile of the tumor. In a prior study, a 62-probe set signature (NB-hypo) was identified for detecting tissue hypoxia in neuroblastoma. This signature effectively stratified neuroblastoma patients into good and poor outcome groups. Establishing a prognostic classifier was crucial for grouping patients into risk categories, aiding in the selection of tailored therapeutic approaches.

To enhance the accuracy of predictors and create robust tools for clinical decision support, novel classification and data discretization approaches were explored. In this study, Rulex was employed on gene expression data, specifically using the Attribute Driven Incremental Discretization technique to transform continuous variables into simplified discrete ones. This pre-processing step facilitated rule extraction through the Logic Learning Machine (LLM). The application of LLM yielded 9 rules, primarily based on the relative expression of 11 probe sets. These rules proved highly effective as predictors, validated independently and confirming the efficacy of the LLM algorithm on microarray data and patient classification.
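The Attribute Driven Incremental Discretization procedure itself is not reproduced here; as a rough stand-in, the sketch below shows the general shape of this pre-processing step, mapping continuous expression values to a few ordered labels using simple quantile cut-points.

```python
import numpy as np
import pandas as pd

# Rough stand-in for the discretization step before rule extraction:
# continuous probe-set values become a few ordered labels. Quantile
# cut-points are a simplification, not the ADI algorithm itself.
rng = np.random.default_rng(0)
expression = pd.DataFrame(rng.lognormal(size=(100, 3)),
                          columns=["probe_A", "probe_B", "probe_C"])

discrete = expression.apply(
    lambda col: pd.qcut(col, q=3, labels=["low", "medium", "high"])
)
print(discrete.head())
# Rules are then readable over the labels, e.g.
# "IF probe_A = high AND probe_C = low THEN poor prognosis".
```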

The LLM demonstrated efficiency comparable to Prediction Analysis of Microarray and Support Vector Machine, surpassing other learning algorithms like C4.5. Rulex conducted feature selection, resulting in a new signature (NB-hypo-II) comprising 11 probe sets, identified as the most relevant in predicting outcomes. This comprehensive approach underscores the potential of utilizing LLM in the development of reliable prognostic classifiers for cancer patients.

6. Extracting Intelligible Rules in Neuroblastoma Prognosis

Neuroblastoma, the most common pediatric solid tumor, poses a significant challenge as approximately fifty percent of high-risk patients do not survive treatment. The urgent need for improved stratification strategies led to the exploration of new, more effective approaches. Hypoxia, characterized by low oxygen tension in poorly vascularized tumor areas, is associated with a poor prognosis. This study aimed to develop a prognostic classifier for neuroblastoma patients by integrating existing knowledge of clinical and molecular risk factors with the NB-hypo signature.

The focus was on creating classifiers that produce explicit rules easily applicable in a clinical setting. The Logic Learning Machine, known for its accuracy, seemed promising for achieving the study’s objectives. The algorithm was employed to classify neuroblastoma patients based on key risk factors: age at diagnosis, INSS stage, MYCN amplification, and NB-hypo. The algorithm successfully generated clear classification rules that aligned well with established clinical knowledge.

To enhance stability, an iterative process identified and removed examples causing instability in the rules from the dataset. This refined workflow resulted in a stable classifier highly accurate in predicting outcomes for both good and poor prognosis patients. The classifier’s performance was further validated in an independent dataset. Notably, NB-hypo emerged as a crucial component of the rules, demonstrating a strength comparable to tumor staging. This comprehensive approach showcases the potential of the Logic Learning Machine in developing a robust prognostic classifier for neuroblastoma patients.

7. Validating a New Classification for Multiple Osteochondromas Patients

Multiple osteochondromas (MO), formerly recognized as hereditary multiple exostoses (HME), is an autosomal dominant disorder marked by the development of benign cartilage-capped bone growths known as osteochondromas or exostoses. Despite various clinical classifications proposed, a consensus remains elusive. This study aimed to validate an “easy-to-use” tool, employing a machine learning approach, to categorize MO patients into three classes based on the number of affected bone segments, the presence of skeletal deformities, and/or functional limitations.

The proposed classification, assessed through the Switching Neural Network underlying the Logic Learning Machine technique, demonstrated a highly satisfactory mean accuracy. A comprehensive analysis of 150 variables across 289 MO patients facilitated the identification of ankle valgism, Madelung deformity, and limitations in hip extra-rotation as distinctive features (“tags”) of the three clinical classes. In summary, the proposed classification offers an effective system for characterizing this rare disease, enabling the definition of homogeneous patient cohorts for in-depth investigations into MO pathogenesis.

8. Predicting Obstructive Sleep Apnea in People with Down Syndrome

Obstructive sleep apnea (OSA) is notably prevalent in individuals with Down Syndrome (DS), with reported rates ranging from 55% to 97%, a stark contrast to the 1–4% prevalence in the neurotypical pediatric population. However, conventional sleep studies are often uncomfortable, expensive, and poorly tolerated by those with DS.

To address this, a dataset encompassing over 460 observations was compiled for 102 Down syndrome patients. Each patient underwent a polysomnogram, and the dataset included diverse information such as clinical visit findings, parent surveys, wristband oximeter data, urine proteomic analysis, lateral cephalogram results, and 3D digital photos.

Utilizing the Logic Learning Machine (LLM), a predictive model was developed to ascertain the occurrence of obstructive sleep apnea in individuals with Down syndrome. This approach aimed to offer an alternative to uncomfortable and costly tests like polysomnograms.

The LLM classification task successfully identified a predictive model represented by a set of simple rules, exhibiting a high negative predictive value of 81.5%. Additionally, the Feature Ranking task allowed for the identification of the most relevant variables, assigning a quantitative score to their importance in the predictive model. This innovative methodology not only makes diagnosis more comfortable for individuals with DS, but also provides a streamlined and effective means of identifying obstructive sleep apnea.

9. Benchmarking LLM Performance on Standard Biomedical Datasets

In this study, we employed Rulex’s Logic Learning Machine on three benchmark datasets related to distinct biomedical issues. These datasets were sourced from the UCI archive, a repository of data used for machine learning benchmarking. The datasets are as follows:

  1. Diabetes:
    • Objective: Diagnosing diabetes based on the values of 8 variables.
    • Patient Characteristics: All 768 patients considered are female, at least 21 years old, and of Pima Indian heritage.
    • Cases and Controls: Of the 768 patients, 268 are actual cases of diabetes, while the remaining 500 are controls.
  2. Heart disease:
    • Objective: Detecting heart disease using a set of 13 input variables related to patient status.
    • Sample Size: The total sample comprises 270 elements, with 120 actual cases of heart disease and 150 controls.
  3. Donor/acceptor DNA:
    • Objective: Recognizing acceptor and donor sites in primate gene sequences of length 60 (bases).
    • Dataset Composition: The dataset consists of 3186 sequences categorized into three classes: acceptor, donor, and none.

The performance of the Rulex Logic Learning Machine (LLM) was compared to other supervised methods, including Decision Trees (DT), Artificial Neural Networks (ANN), Logistic Regression (LR), and K-Nearest Neighbors (KNN). The conducted tests revealed that the results obtained by LLM surpassed those of ANN, DT (which also generates rules), and KNN. Moreover, LLM’s performance was found to be comparable to that of LR.

| Dataset  | Records | Inputs | Classes | LLM Accuracy | LLM Rules | DT Accuracy | DT Rules | ANN Accuracy | LR Accuracy | KNN Accuracy |
|----------|---------|--------|---------|--------------|-----------|-------------|----------|--------------|-------------|--------------|
| Diabetes | 768     | 8      | 2       | 76.52%       | 16        | 76.09%      | 42       | 75.65%       | 76.52%      | 68.70%       |
| Heart    | 270     | 13     | 2       | 75.31%       | 19        | 64.20%      | 17       | 72.84%       | 74.07%      | 51.85%       |
| DNA      | 3186    | 60     | 3       | 91.98%       | 19        | 90.04%      | 67       | 87.09%       | 92.57%      | 40.38%       |
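The LLM itself is proprietary, but the open baselines in this comparison can be reproduced along the lines below with scikit-learn. The file name, column name, and 10-fold validation protocol are assumptions, so exact accuracies will differ from the table.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumes the UCI Pima diabetes data saved locally as pima.csv,
# with the class label in an "Outcome" column.
data = pd.read_csv("pima.csv")
X, y = data.drop(columns="Outcome"), data["Outcome"]

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "ANN": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=10).mean()  # 10-fold accuracy
    print(f"{name}: {acc:.2%}")
```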

Discover more about Rulex for life sciences & healthcare

Optimizing data management: the power of data agility for time, cost, and energy savings

Is my data agile?

Agile data is as boundless as your imagination. It can be aggregated from multiple locations, in different formats, shapes, and sizes, and seamlessly merged and shaped into the form you want; it is consistent, up to date, and easy to analyze; and it opens up a universe of high-quality optimization and AI forecasting opportunities.

If this is a perfect description of your data, stop reading, and get yourself a glass of cava to celebrate. Kudos to you!

If this description sounds like unicorn utopia, and your data dreams have yet to come true, read on…

Why is my data wasting time?

Data is such a multi-faceted beast nowadays. It lurks in various forms, from databases to local files, and even sneaks into our email attachments. It can disguise itself as an innocent MS Excel spreadsheet, a tricky table in a PDF, or an elusive SAP table. But the cruelest twist? It never stops changing!

This relentless shape-shifting poses a colossal challenge for companies far and wide. Picture global organizations spanning countries and continents, where each geographic region collects data with its own unique flair – different formats, orders, and even languages. The result? A labyrinth of incompatible data that devours time like trying to pull on a pair of jeans on a sandy, damp beach.

Why is my data wasting money?

In the vast realm of data, there’s a universal truth: garbage in, garbage out. If your data is riddled with inaccuracies, gaping holes, and inconsistencies, it’s a recipe for disaster. Every operation performed on such flawed data becomes a dance of suboptimal outcomes. Optimization becomes a feeble attempt at “making it a bit better-ization,” while AI forecasts stumble in the realm of approximation.

But the story doesn’t end there. The hidden costs of inadequate data lurk beneath the surface, silently draining your finances ($15 million was the average annual financial cost in 2017, according to Gartner’s Data Quality Market Survey). Inaccurate insights then lead to misguided decisions, wasted resources, and missed opportunities. The consequences ripple through every aspect of your operations, chipping away at profitability.

Why is my data stressing me out?

Data should be your ally, not a source of stress and frustration. Imagine a world where agile data effortlessly unveils hidden treasures, allowing you to explore, filter, and query with ease. Business questions are swiftly answered, useful insights and patterns emerge like shooting stars, and data analysis becomes a dream come true for every analyst.

Alas, research reveals a disheartening truth: data scientists devote a staggering 80% of their time to the laborious tasks of data collection, cleaning, and organization, and 76% of them consider this the least enjoyable aspect of their work. Valuable time is squandered on frustrating, low-value, repetitive tasks, instead of innovative endeavors that can be more invigorating than a double shot of Italian espresso.

Is there a solution to increase data agility?

In the quest for data agility, there is a knight in shining armor: the smart data management platform.

Yet, beware, for not all heroes are forged alike.

So what key attributes should you seek in selecting a platform that genuinely empowers data agility?

First and foremost, start putting time back in your working day by seeking a platform with the power to gather data from every nook and cranny – your platform should be a fearless explorer, collecting data from anywhere it resides.

Ensuring superior quality is paramount when aiming to drive revenue growth, and your chosen platform must rise to the occasion. Seek a solution that enhances data integrity by effectively addressing missing data and outliers, while also leveraging your business rules as a guiding light for validation. Additionally, harness the power of AI to uncover concealed errors lurking in the shadows, bolstering the overall accuracy and reliability of your data.
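As a concrete, generic illustration of business rules driving validation, the sketch below flags the rows that violate each rule. It is plain Python, not Rulex syntax, and the rules and data are invented.

```python
import pandas as pd

# Generic illustration of rule-based data validation (not Rulex syntax):
# each business rule flags the rows that violate it.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "quantity": [10, -5, 3, 250],
    "unit_price": [9.99, 4.50, None, 2.00],
})

rules = {
    "quantity must be positive": orders["quantity"] <= 0,
    "unit_price must be present": orders["unit_price"].isna(),
    "quantity above plausible maximum": orders["quantity"] > 200,
}
for rule, violated in rules.items():
    for order_id in orders.loc[violated, "order_id"]:
        print(f"order {order_id}: {rule}")
```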

Finally, to ensure a stress-free experience, it is imperative that your selected platform possesses the innate ability to seamlessly integrate, reshape and transform data. This capability allows you to focus on the enjoyable nerdy aspects, while presenting results through awe-inspiring and interactive dashboards that captivate and inspire.

Rulex Platform ticks all the boxes (and more)

When it comes to conquering the data agility challenges faced by Fortune 50 companies, Rulex Platform reigns supreme. We’ve witnessed first-hand the pain caused by incompatible, low-quality, frustratingly unfathomable data, and that’s why we’ve dedicated a lot of our precious time to the art of data agility.

Not only has our platform become a beacon of transformation, saving our clients millions, accelerating their processes, and delivering remarkable results, we’ve also created a user-friendly what-you-see-is-what-you-get haven, welcoming both data nerds and business experts with open arms.

Hidden behind the simplicity of this drag-and-drop interface lies a powerful capability: every operation seamlessly generates optimized code in the background. This extraordinary feature brings forth a multitude of benefits, including CI/CD versioning and collaborative work, automatically generated documentation, swift customization, effortless integration of external scripts, and hassle-free maintenance and debugging.

While we haven’t mastered the art of selling unicorns just yet, rest assured, it’s on our radar.

Are you ready to unlock the full potential of your data?

It’s time to take action

Discover Rulex Platform and witness the power of true data agility.

Start a free 30-day trial and unlock the full potential of your data.

Why Rulex Lite? An interview with Rulex’s CEO, Marco Muselli

We asked the CEO of Rulex, Marco Muselli, to explain why he has decided to launch an entry-level version of Rulex’s data management system: Rulex Lite.

Why has Rulex started selling an entry-level version of its software?

As we speak, Rulex Platform is being used by many large enterprises around the world to optimize and digitalize their business processes, which is great news for us. But it has always been my ambition to make this powerful technology accessible to anyone who handles data, and needs to perform simple everyday operations, but does not have the programming skills to go beyond Excel spreadsheets.

How have you managed to lower costs so significantly?

We basically removed or reduced the software capabilities that are really only essential for large enterprises and focused on maintaining what is important for personal users and small businesses. So this version doesn’t have the features for working in large groups, and also has a data limit of 10 million cells, which is still more than sufficient for small companies and individuals, and way beyond what Excel can handle. These changes meant we could bring costs right down and offer a very affordable version of the software.

Who do you think would be interested?

There are many small companies and business owners who simply cannot afford expensive enterprise solutions, but would really benefit from this useful technology, and a low-cost license gives them a chance to greatly improve how they handle and get value from their data. Another group that comes to mind is students, and generally the world of academia, especially in non-scientific faculties, such as economics and marketing. This group of people may not have the technical knowledge to build custom solutions, but nonetheless need to handle large amounts of data and build data analysis logic.

How can people learn how to use Rulex Platform?

The software itself is a graphical tool, which means you just drag and drop data files and tasks onto a canvas, where you then work on the data in spreadsheets, so it’s very intuitive. But you can also build pretty complex business logic through its available tasks, so you’ll need to learn a little about how to interact with the software to make the most out of its capabilities. We’ve tried to make this as simple as possible, so although you’ll find initial help in walkthroughs and technical documentation, we’ve also built a strong community, where you can download sample workflows, take free courses, ask for ideas in discussion forums, watch explanatory videos etc. We try to guide new users as much as possible, and don’t take any tech skills for granted.

How can people buy this entry-level license?

People can pick it up directly with a credit card through the Rulex Community store, without the hassle of having to speak to sales reps or fill in endless forms. There are monthly and annual subscriptions, so if you don’t want to commit, you can try it out for a few months and see how it goes.

Is there still a free trial?

Absolutely, there is a 30-day free trial to try it out. If you get on well with the software, you can subscribe for a month at a time, with the option of switching to an annual subscription and saving yourself another 20% once you’re really appreciating the benefits. At a cost of €95 per month, I feel I’ve realized my ambition to make it accessible to everyone.

Overcoming the restrictions of no-code platforms: welcome to self-coding

You’ve heard the hype. No-code development platforms are so simple to use that anyone can master them. By removing the obstacle of needing to know programming languages and have coding skills, they contribute to the democratization of any new technology. But is it all really that simple? Does this simplicity come at a cost?

What are no-code development platforms?

No-code development platforms allow non-programmers to analyze data and build solutions in drag-and-drop visual editors, without writing a line of code. And if your business problem is pretty standard, that’s all great. Up to a point.

Understanding no-code development limitations

Generally, the simpler the software, the less flexible it becomes. This is particularly true of no-code platforms, which may come apart as the complexity of your needs increases and you no longer have the building blocks to cover the required business logic. Even when workarounds can be integrated, which is not always the case, they require specialized programmers and multiple scripts, which are notoriously difficult to keep track of, like an untidy desktop. Customization can be costly and time-consuming, eating into all the advantages you hoped to find with no-code tools.

With remote working on the rise, and as your company hopefully grows, the need for collaborative work on any given solution will also increase. Collaboration requires a code base to merge and store different versions of solutions, which, as you can pick up from the name, no-code platforms do not provide.

The alternative to no-code solutions so far

Faced with the need for customization and collaboration, many companies choose to forego no-code and build their own custom software solution. Custom solutions are inherently tailored to specific business needs (as long as these don’t change). But they present several drawbacks.

Creating in-house tools is costly

The first hurdle is, ironically, cost. Although you would instinctively expect to save money by using an open-source programming language, such as Python, and avoiding the initial expense of a no-code platform, building your own solution can work out pretty expensive. Programmers are pricey, and you must either face the cost of hiring an internal team or outsource the job. Neither option works out cheap. Outsourcing may seem to make more sense at the start, but inevitable modifications, integrations, and maintenance will result in additional costs and delays, and total dependence on the service provider.

A problem of time, resources, and technical support

The next issue to face is time. While you can get started pretty quickly with a drag-and-drop interface, building custom applications can be very time-consuming. Another aspect to consider with custom development is that programmers are human (hopefully you knew that). Even when their work is impeccable, each programmer will naturally have a different logical approach to reaching the same result. This can make solution maintenance complicated if modifications are not made by the same people who worked on the original project, which is quite likely, given the high turnover rate of software developers. And finally, once the solution is ready and approved, time-consuming reworking is usually required to make it production-ready, which means more money and more time… Is there an answer?

Welcome to self-coding

There is an innovative solution that unites the advantages and overcomes the disadvantages of both no-code platforms and custom code development: self-coding. Self-code platforms have the initial user experience of an intuitive graphic platform, which allows business users to build multiple applications without writing code. But this apparent simplicity hides an immense flexibility to deal with custom business scenarios. As you drag and drop building blocks, configure tasks, and work with your data, the platform writes the underlying optimized code behind the scenes for you.
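To make the pattern concrete, here is a deliberately simplified illustration of the self-coding idea. It is not Rulex’s actual implementation, just the general mechanism: every GUI action is recorded as a step, and the recorder can emit an equivalent, versionable script at any time.

```python
# Deliberately simplified illustration of the self-coding pattern,
# not Rulex's actual implementation: each GUI action is recorded,
# and the recorder can emit an equivalent script on demand.
class Recorder:
    def __init__(self):
        self.steps = []

    def record(self, code_line: str):
        self.steps.append(code_line)

    def emit_script(self) -> str:
        return "\n".join(["import pandas as pd"] + self.steps)

rec = Recorder()
# The user drags an import task and a filter task onto the canvas...
rec.record('df = pd.read_csv("sales.csv")')
rec.record('df = df[df["region"] == "EMEA"]')
# ...and the versionable, documentable code already exists:
print(rec.emit_script())
```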

Self-code software benefits

It’s a simple WYSIWYG interface for your business users, but it’s also a code solution when you need it. So, what difference does this actually make? It means that multiple people can work on the same solution, creating and merging different versions in a Git-like versioning system, making true collaboration possible.

The code base of the solution means that the platform itself can be quickly expanded and enhanced, enabling constant innovation and the extremely fast development of new features. Consequently, customization can be implemented very quickly, often in days, so you are not stuck for months while you wait for a new connector or essential custom feature. If you want to enable your business users to build complex applications independently, while still keeping a flexible solution that allows collaborative working, and save money on expensive programming experts, look into self-code platforms. You can’t really afford not to.

Rulex’s self-code platform

Rulex Platform is an end-to-end data management platform and an innovative example of self-coding. Optimized code is generated for every operation in its drag-and-drop web interface, and each step is tracked in an interactive graphic history table. Each action can be undone, re-executed, saved, or even deleted, and the code behind each action can be viewed with a single click. Rulex Platform also leverages this code base to automatically generate documentation, which can be quickly shared with colleagues, so you don’t have to keep track of your actions. You can even add custom comments, which get automatically integrated in just the right place in the final documentation.

Developing solutions fast

In combination with self-coding, an additional peculiarity of Rulex Platform is its simplicity and speed in developing solutions, by keeping the underlying data visible during every development phase. As you add logical elements to your workflow and manipulate your data, you can execute each part in seconds, immediately check the results, and change track if required. And once you are satisfied with the results, the workflow is production-ready, speeding up change management, and avoiding wasting time with reworking.

Knowing more about self-coding

Any developer or tech lover who wants to discover the nitty-gritty of self-code development should check out the tech blog on Rulex Community: Advantages of self-coding platforms for developers.

Test it yourself with our self-code platform

Start a 30-day free trial of Rulex Platform, and try out self-coding yourself to develop your projects.

Smarten up your everyday data management

Gathering and merging data from multiple sources and formats can be a huge initial hurdle to overcome for many businesses. Importing data into Rulex Platform really is as simple as dragging and dropping a task.

The first step in any data management process is gathering your data. But when you start looking at what you have, you’ll soon find that it’s pretty messy and disorganized. You may have transactional data on an SAP database, numerical data stored in MS Excel files on a SharePoint repository, and text files saved locally.

To complicate matters further, each dataset is likely to be structured according to its purpose. Transactional data may be based on the Order ID, customer service data on the Customer ID, and product data naturally on the Product ID.

So, how can we merge all these sources, and make any sense of it all?

Getting off the starting block to data collection

Getting started with a data management plan is always the hardest part. It may leave businesses, big or small, on the starting block, wondering what to do first.

In Rulex we understand your pain, so we’ve made sure that Rulex Platform has all the tools you need to be the first off the block.

Let’s go through the key actions data scientists and business analysts perform when importing and merging data on Rulex Platform.

Importing data from different sources

It really is as simple as dragging and dropping a task and selecting the source where your data are stored.

Rulex Platform is all WYSIWYG, so while you change options in the task, you get to see a preview of what data you are about to import, and how.

So, what databases are supported? Pretty much all the commonly used databases on the market, including SQL Server, MySQL, Postgres, Teradata, Hive, Impala, Spark, Azure Synapse, DOMO, Snowflake, Oracle, SAP S/4HANA, and the IBM DB2 series. And the list keeps growing.

Once you have set up the connection parameters for your database or file system, you can save them, and even set permissions to share them with other users.

And if your data are in the cloud? No problem. Rulex Platform supports AWS S3, Microsoft SharePoint, FTP/S and HTTP/S servers, Hadoop HDFS filesystems, Share drives, Azure Files, BLOB Storage, and many more.

Blending data with different file formats

Data can be imported from practically anywhere, but what about the format? Everyone knows that each format has its own structure and requirements.

To speed up the process, there are separate tasks for the main data types, such as MS Excel, text files (CSV, TAB, TXT), XML, and JSON.

Once imported, Rulex Platform automatically converts the file into a single table format, even working out the data type of each column. Whether your original data was in MS Excel, CSV, XML or an SAP table, the imported results will all look the same in Rulex. So it’s then really easy to quickly reshape these tables, and blend them into a single spreadsheet.
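For comparison, the sketch below does the same blending by hand in plain Python, which gives a sense of what the platform automates behind the drag-and-drop interface. The file names are invented.

```python
import pandas as pd

# Hand-coded equivalent of the blending step (file names are invented):
# different formats end up as uniform tables that can be merged.
excel_part = pd.read_excel("orders.xlsx")
csv_part = pd.read_csv("orders_export.csv")
json_part = pd.read_json("orders_feed.json")

combined = pd.concat([excel_part, csv_part, json_part], ignore_index=True)
print(combined.dtypes)  # column types inferred per column
```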

If you’d like to know more about table transformation, check out our article on Rulex Community: Could you please reshape my table?

Harmonizing data

Summing up what we’ve seen so far, Rulex Platform not only allows you to import multiple data formats from multiple sources, but also to merge all these files into a single spreadsheet, providing you with a user-friendly data view in just a few easy steps, so you can start getting the answers you were looking for.

Handling really big datasets

As your business grows, the data you have at your disposal grows too, to the point where it becomes difficult to handle. Excel files may grow to such an extent that Excel itself has problems opening them, and even simple data-prep operations become excruciatingly slow.

Thus, questions start arising. Is there a risk in merging all our data into a single dataset? Will it get so big that even sorting columns will become painfully slow?

The simple answer is: no. Rulex Platform can handle vast amounts of data extremely quickly. For example, it can sort 5 million rows of data in 2.2 seconds. Impressive.

Exporting data in whatever format

Once your data have been imported and elaborated, you can export the results how and where you want. Just drag an export task onto the canvas, then select the format and destination, which can even be via email to a list of recipients.

Using REST APIs to import/export data

Rulex Platform has a REST API that allows you to programmatically import data into the platform. This method can be useful if you want to automate the data import process or if you want to integrate Rulex with other systems.
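The route and parameters below are placeholders rather than documented Rulex API syntax (consult the official REST API reference for the real endpoints), but the general pattern is the usual authenticated upload:

```python
import requests

# Placeholder endpoint and parameters: consult the Rulex REST API
# documentation for the real routes; only the general pattern is shown.
BASE_URL = "https://rulex.example.com/api"
TOKEN = "your-api-token"

with open("sales.csv", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/imports",                      # hypothetical route
        headers={"Authorization": f"Bearer {TOKEN}"},
        files={"file": ("sales.csv", f, "text/csv")},
    )
response.raise_for_status()
print(response.json())
```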

Ready to import and combine data on Rulex Platform?

You can download a free trial from Rulex Community, where you’ll also find discussion forums, articles, interactive courses, sample flows, and videos to get you off to a flying start with your data.

And just to answer your last question: is Rulex Platform too expensive for a small business or personal user? The answer is no. With a starting price of €95/month, Rulex Platform is very accessible.

Handle your data like a data scientist

2 big reasons why you should drop spreadsheets and start working like a data scientist

In our experience in Rulex, people want to understand the basics of data science either because they want to do their job better, or because they want a better job.

If you’re reading this article, you most likely identify with one of these two groups, so let’s understand the benefits of dealing with data like a data scientist, and WIIFY (what’s in it for you).

Do your job better

If you’re currently handling data through spreadsheets, upgrading to a data analytics tool adopting data science techniques will have a major impact on your daily work.

How?

By enabling you to get game-changing insights from your data independently, in minutes not hours, and by freeing up time for more strategic work (or even allowing you to leave work on time for once). Let’s dive in deeper…

1. Work independently with confidence

Do you ever need to produce results with zero warning?
Like reporting on average sales on your latest product for a meeting at 10am?
Drag-and-drop, no-code analytics platforms mean you can get the answers you need from your data autonomously, without having to ask (and wait for) your IT department.

2. Work efficiently

Is your data spread over different spreadsheets and database tables, or on remote repositories? Each one designed by a different person, with different ideas?
How on earth are you going to be able to work with all these sources without having a copy & paste meltdown?
Data analytics tools allow you to reshape, merge, and concatenate data from multiple sources and formats quickly and simply.

3. Make your work less stressful and repetitive

Do you ever feel like you’re re-inventing the wheel every day?
Every time your data changes, you have to painstakingly adapt your worksheet formulas one by one.
Data analytics tools let you build workflows which automatically adapt to changes.
For example, with Rulex you can build resilient workflows without writing a single line of code, and schedule them to run daily, sending the results to your inbox while you’re drinking your coffee, in time for that 9am morning staff meeting.

Get a better job

Over a decade ago, the Harvard Business Review predicted that data scientist would be the sexiest job of the 21st century, and with an average salary now over $100,000 and a projected 28% increase in jobs through 2026, they were right!

Along with increasing demand from companies that are waking up to the value of data science for increasing profitability and gaining competitive advantage, these numbers are also driven by the chronic lack of data scientists around the globe, with far more positions open than available candidates.

So it goes without saying that anyone who has data science skills will find themselves instantly in demand, with infinitely better prospects.

 

How can I get the skills I need?

Whether a better job for you means a promotion in your current company, or a complete career change, there is no doubt that data science skills will make you stand out from the crowd and increase your career prospects.

So how can you get these lucrative skills?

  1. Sign up for a hands-on, self-paced digital training course, which will allow you to build your data science skills step by step.
    Rulex offers a wide range of beginner-friendly interactive courses on its digital training platform: Academy – Rulex AI.
  2. Get professional certification you can share on your social media and add to your CV, to get your newly acquired skills noticed ASAP: Rulex Certification – Rulex AI.
  3. If you’re still in higher education, take data science modules, whether your major is in marketing, supply chain, or literally whatever! Don’t underestimate the value of these skills in any field. If your university doesn’t offer such courses, you’ll find that some of the top AI companies do: Rulex University Program – Rulex AI.

Stop wasting time on data preparation

It takes time to get to know your data. Understanding your data and preparing it for analysis is a critical step for any successful project.

But as any data analyst will tell you, data preparation is a time-consuming process, and often involves multiple software tools: one to extract data, one to transform and convert, and finally another tool to upload the results.

Tracking your activities further slows down this process, but is crucial for rolling back operations, or simply for explaining what you did.

How Rulex makes your life easier

1. Single environment

In Rulex all data import and preparation is performed in a single no-code platform where you can visualize results instantly.

Import your data, then manipulate them to get straight to the results you need: concatenation, discretization, time series analysis, statistics, formulas, reshaping … everything you could possibly need to not only understand your data, but also prepare them for advanced analytics.

Export your results from the same single working environment to local files, remote repositories, databases or even send files via email to a list of recipients each time you run the workflow.

2. Multiple sources and formats

Import your data from multiple different local and remote sources, in many different formats (XML, MS Excel, TXT, CSV, JSON etc), and merge them into a single dataset.

Instead of wasting time planning how datasets may interconnect in theory, based on their metadata, try it out in Rulex with your real data, thanks to extremely fast computational speed.

Check the results, tweak some changes, and once you’re satisfied, the job is done.

No additional implementation is required.

3. Rulex traces all your operations, so you don’t have to.

Thanks to Rulex’s automatic tracing of operations, rolling back in any part of the transformation process is a question of a few mouse clicks.

Technical and operational documentation can be generated from comments, so explaining and documenting the change process is quick and painless.

 

Data Manager – the heart of data preparation

The heart of the data preparation phase is the Data Manager task, where you can:

  1. Explore your data through graphical tools, such as plots, curves and pies, and an array of single and bivariate statistics, to understand if the data at hand is appropriate for solving your problem. Plots can be saved and added to presentations to explain your results to team members.
  2. Filter, group and sort large datasets in seconds, vastly improving dataset usability and interpretation.
  3. Clean up your data to maintain high levels of data quality by:
    • removing unnecessary attributes,
    • identifying and removing outliers, which can negatively impact results,
    • standardizing how missing values are expressed,
    • correcting incorrect values whenever possible.
  4. Enrich your dataset with new values derived from existing attributes, thanks to the wide range of formulas. (A rough code equivalent of these steps is sketched below.)
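For readers who think in code, here is a rough pandas equivalent of those cleaning and enrichment steps. Column names and thresholds are invented for illustration; in Rulex the same operations are performed through graphical tasks.

```python
import pandas as pd

# Rough pandas equivalent of the Data Manager steps (column names invented).
df = pd.read_csv("patients.csv")

df = df.drop(columns=["internal_note"])            # remove unnecessary attributes
q1, q3 = df["weight"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["weight"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]  # drop outliers
df["smoker"] = df["smoker"].replace({"N/A": None, "?": None})  # standardize missing
df["bmi"] = df["weight"] / (df["height_m"] ** 2)   # enrich with a derived value
print(df.head())
```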

The First Rule About AI Club: You Don’t Talk About AI

How focusing on decisions can help you productionize your next AI project

The internet is the best place in the world to turn your suspicions into nightmares. Suspecting your partner is cheating on you? Every Facebook Like will confirm it. Feeling a little dizzy? Dr. Google will immediately diagnose how many days you have left to live. Wondering if the Earth might really be flat? Well, you get the idea… Shifting to your workplace:

If you are wondering whether your company should adopt AI, the web will serve you crucial “insight” on the imminence of your bankruptcy if you don’t act immediately.

What differentiates AI from ML?

To clearly understand the difference between AI and ML, I personally like John McCarthy’s definition of AI, as it is very simple:

“AI involves machines that can perform tasks that are characteristic of human intelligence.”

Such tasks include things like understanding natural language, identifying objects in images, recognizing sounds, and playing complex strategy games. I find this definition very powerful as it does not put any stress on the underlying technology. It basically tells us that AI is a glorified version of good-old process automation, which now includes human-centric processes that weren’t possible just a decade ago.

ML, at its core, is nothing but one of the many technologies used to achieve AI. Disruptive, innovative, sexy… but still just a technology. If we don’t untangle this difference, we will find ourselves asking the mother of all incorrect questions: “What problems can I solve with ML?”.

This question unconsciously traps you into searching for the right problems for the technology at your disposal, which is never a good business approach. As the famous psychologist Abraham Maslow once stated: “If all you have is a hammer, everything looks like a nail”.

The problem is that your company might not need a nail at all.

Don’t get me wrong, I’m not saying you should never ask yourself this question. I’m simply saying that this should not be part of any AI-strategy conversation. It’s a headache your data science team will be more than happy to take on.

How to NOT fail

After years of experience, mostly in complex supply chain automation, I’ve seen projects fail for reasons ranging from using the wrong technology, to dirty data, to a lack of team cooperation. While addressing these problems is clearly important, not understanding the logic behind decisions and overlooking their impact on the business is by far the most lethal mistake.

Focus on the decisions you want to automate, not on the technology.

Decisions are the natural outcome of any learning process; we learn things to better react to the situations we face and to avoid previous mistakes. At the end of the day, introducing AI in your company is nothing but allowing machines to transform your data into decisions. That’s why, in every successful project, we have always started with “the end” in mind, focusing on the output we wanted to create and asking ourselves: what decisions are we trying to automate? by how much and when do we want to improve the decision-making process we are looking at?

It’s all about Decision Automation

It was interesting to notice how focusing on understanding decision logic led us to become increasingly detached from technological conversations. The term AI basically disappeared and was replaced with “Decision Automation”, which, while not a new concept, isolates the final outcome and its scope of work: enhancing the quality of our decisions and removing humans from the part of the underlying process which does not require judgment, creativity or control.

Let AI do (only) what it can do better than you.

Focusing on decisions can greatly help us build a simple framework that can better identify and tackle our next AI-project.

Start by asking the right questions

Some of the questions we might want to ask ourselves are:

  • Are we looking at operational or strategic decisions? Operational decisions happen daily and repeatedly; they are often boring and unedifying for the people in charge. They rely on well-defined rules or logic and are therefore the perfect candidates for automation. For example, saving time while reducing inefficiencies should be the focus of our attention when replenishing distribution centers, identifying fraudulent claims, or non-performing loans. On the other hand, strategic decisions such as “should I make this investment?” or “should I partner with this company?” require unstructured insight, which quite simply cannot be automated, but only, as defined by Gartner, “augmented” by using the right technology.
  • What is the impact of wrong decisions? Being able to gauge the effect of wrong decision-making, both in terms of money lost and people affected, is essential when prioritizing the tasks you want to automate. Experiencing recurrent out-of-stocks or overstocks could lead you to optimize your replenishment process, while a large part of the problem might be due to an incorrect setup of your master data, which is affecting not only distribution but also production, transportation, and forecasting (true story, by the way).
  • Can we decide fast enough to modify events in due course? While the quality of our decisions might be good enough, the process involved to reach these results may be excessively taxing on the business. For example, most business-critical activities in the supply chain are still done manually or semi-manually, and they are consequently lacking in flexibility and resilience.
  • Do we know the logic behind the decisions? Do we know how and why something happens? Finally, we talk about technologies! If the answer is yes, technology can provide businesses with support through RPAs (learn more on RPA) and rule automation for simpler tasks, and low-to-no-code ETL tools and optimizers for the more challenging ones. If the answer is no, but there is an underlying logic, then ML can dig it out from the data. An example is customer churn analysis: it is impossible to specify upfront what drives customers to leave, but that information is probably hidden inside the data (a minimal sketch of this idea follows the list).
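A minimal sketch of that kind of churn model, on invented data and using scikit-learn (any comparable library would do), shows the point: the drivers of churn are learned from the data rather than specified upfront.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Invented churn data: the drivers of churn are learned from the data
# rather than specified upfront.
data = pd.DataFrame({
    "tenure_months": [2, 36, 5, 48, 1, 24, 60, 3],
    "support_tickets": [5, 0, 4, 1, 6, 2, 0, 7],
    "monthly_spend": [20, 80, 25, 90, 15, 60, 120, 30],
    "churned": [1, 0, 1, 0, 1, 0, 0, 1],
})
X, y = data.drop(columns="churned"), data["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
# Which factors the model learned to weigh most heavily:
print(dict(zip(X.columns, model.feature_importances_.round(2))))
```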

In conclusion

Being able to provide quantitative answers, such as the number of decisions involved, the inherent cost of wrong decisions, or the man-hours needed to support the process, should be the gateway to automation investments.
These answers help us build compelling cases to convince management of the value of automation, and serve as leading indicators of whether we are ready to invest in innovation and automation.
