On September 23, 1999, at 09:00:46 UT, NASA's Mars Climate Orbiter lost contact with Earth as it passed behind Mars. The anticipated reconnection, 27 minutes later, never occurred – by that time, the spacecraft had been destroyed in the Martian atmosphere. Subsequent investigations revealed that the incident was caused by navigation data not being converted from English units to the metric standard.1
On April 8, 2018, a Samsung Securities employee inadvertently entered “shares” instead of the Korean currency “won” due to a keyboard error. This led to the accidental distribution of “ghost” shares worth over 100 billion dollars, ultimately causing a significant decline in Samsung Securities stock, not to mention a loss of credibility.2
What ties these incidents together? Data quality.
Data quality matters
Ensuring data quality involves tasks such as checking whether values are within range or have the correct format, and it has been at the center of many discussions since the early 1990s.3
Data quality issues may originate in the realm of data, but they are certainly not confined to it: they can significantly impact business efficiency, incur higher costs, and even jeopardize the success of entire projects.4
To tackle the intricacies of data quality problems, organizations of all kinds are constantly looking for effective solutions that combine industry expertise with data knowledge.
Will a spreadsheet cut it?
While spreadsheets may suffice for small datasets with simple rules, they prove inadequate as data volume and rule complexity increase. Suppose you have only one or two data sources that you can merge into a small, unified dataset. If the data quality can be assured with simple rules, such as verifying payment amounts within an expected range, a basic spreadsheet formula might suffice.
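For instance, assuming the payment amount sits in cell B2 and the acceptable range is 10,000 to 50,000, a spreadsheet formula along the lines of =IF(AND(B2>=10000, B2<=50000), "VALID", "INVALID"), copied down the column, would be enough to flag every out-of-range payment at a glance.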
However, as business requirements grow more intricate, data volume expands, or the need arises to integrate new sources, spreadsheets really start to feel the strain, along with the people trying to use them. Similar to training wheels for a novice rider in a park, a spreadsheet is of little use to an experienced rider navigating a steep downhill track.
Find the expert, spell it out, iterate
So with increased complexity, you’ll need a data quality tool that can handle it. Unless you have a very technical background, you’ll also have to bring an expert on board who can implement your rules, such as a Python programmer.
The sort of script your programmer could produce to perform a simple validation check, such as ensuring an amount lies within the 10,000 to 50,000 range, applicable only to projects categorized as “small” or “medium” in size, would look something like this:
import pandas as pd

# Toy payment data: each payment has an amount and a project size category.
data = {
    'amount': [12000, 30000, 60000, 15000],
    'project': ['Small', 'Medium', 'Large', 'Small']
}
payments = pd.DataFrame(data)

# Every payment starts as INVALID; rows that pass the check are then marked VALID.
payments['PaymentStatus'] = 'INVALID'

# Valid: amount between 10,000 and 50,000 and project categorized as Small or Medium.
mask = (payments['amount'].between(10000, 50000)
        & payments['project'].isin(['Small', 'Medium']))
payments.loc[mask, 'PaymentStatus'] = 'VALID'

# Prints VALID for the first, second and fourth payments, INVALID for the 60,000 'Large' one.
print(payments[['PaymentStatus']])
Bringing in an expert to implement the solution is a viable approach, as it allows you to handle volume and complexity. However, it has some important drawbacks:
- Since the implementation is not in your hands, adapting to changes in requirements can be a slow process, involving scheduled meetings to coordinate with programmers and/or tool specialists.
- Despite investing time in clarifying these changes, there’s always a chance that not every detail will be fully grasped or smoothly executed.
- And when it comes to integrating new data sources and ensuring they align seamlessly with existing datasets, things get even more intricate. The effort required can escalate quickly, calling for a diverse set of skills to merge and harmonize everything effectively, as the sketch below suggests.
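To give a rough sense of that effort, here is a minimal pandas sketch of harmonizing and merging a second source into the toy payment data used above. The projects table, its column names, and its labels are invented purely for illustration:

import pandas as pd

# The existing payment data (same toy dataset as in the script above).
payments = pd.DataFrame({
    'amount': [12000, 30000, 60000, 15000],
    'project': ['Small', 'Medium', 'Large', 'Small']
})

# A hypothetical second source: project records exported from another system,
# with a different column name and inconsistent category labels.
projects = pd.DataFrame({
    'proj_size': ['small', 'MEDIUM', 'Large'],
    'owner': ['Alice', 'Bob', 'Carol']
})

# Harmonize the labels before joining, otherwise the merge silently finds no matches.
projects['project'] = projects['proj_size'].str.capitalize()
merged = payments.merge(projects[['project', 'owner']], on='project', how='left')
print(merged)

Even in this tiny example, someone has to know which columns correspond, how the labels differ, and which kind of join to use; with real data, the reconciliation work multiplies quickly.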
The perfect solution would be a tool that can handle high data volumes and varying rule complexities while remaining accessible to a citizen developer.
Meet the Rule Engine
At Rulex, we address data validation challenges with a task called the “Rule Engine”.
This specially designed tool allows users to write business rules in a simple Excel file using an intuitive syntax. The rules can be applied to datasets, and the outputs can be exported to various destinations, such as a database or a local file, or sent via API to an Advanced Planning System.
To assess the validity of our payment data with the Rule Engine, instead of writing a script, it’s sufficient to write a straightforward rule like the following:
IF "amount" > 10000 AND "amount" < 50000 AND "project" in {'Small', 'Medium'} THEN "PaymentStatus" in {'VALID'}
As these rules are written in an external spreadsheet, business users can add and modify them independently, without delving into the intricacies of the workflow or even needing to know how the software works.
Managing business rules becomes seamless. If the complexity grows, it can be easily addressed thanks to the Rule Engine’s support for formulas within the rule syntax, prioritization of rules (executing fundamental rules first), and the ability to manage rule dependencies.
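To make rule prioritization and dependencies a little more concrete, here is a purely illustrative Python sketch (not how the Rule Engine itself is implemented) in which rules run in priority order and a later rule reads the status set by an earlier one. The NeedsReview column and the second rule are invented for the example:

import pandas as pd

payments = pd.DataFrame({
    'amount': [12000, 30000, 60000, 15000],
    'project': ['Small', 'Medium', 'Large', 'Small']
})

# Each rule: (priority, target column, value to set, condition).
# Lower numbers run first, so fundamental checks come before rules that depend on them.
rules = [
    (1, 'PaymentStatus', 'VALID',
     lambda df: df['amount'].between(10000, 50000) & df['project'].isin(['Small', 'Medium'])),
    # This rule reads PaymentStatus, so it must run after rule 1.
    (2, 'NeedsReview', True,
     lambda df: df['PaymentStatus'].eq('INVALID') & (df['amount'] > 50000)),
]

payments['PaymentStatus'] = 'INVALID'
payments['NeedsReview'] = False
for _, column, value, condition in sorted(rules, key=lambda rule: rule[0]):
    payments.loc[condition(payments), column] = value

print(payments)

With the Rule Engine, this ordering is expressed through the rules themselves in the spreadsheet, rather than through hand-written loops like this one.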
And if new data sources come into play, they can be imported and merged into the existing flow through a user-friendly drag-and-drop interface.
3 main benefits of the Rule Engine:
- SIMPLE: You won’t need to onboard programmers to write complex scripts.
- FAST: You can independently modify and test rules and check results in minutes.
- FLEXIBLE: You can quickly add new data sources, prioritize rules, and change output, adapting easily to changing needs.
Whether preventing a space exploration mishap or simply ensuring your business is not losing money, data quality is crucial. The Rule Engine is designed to give citizen developers complete control over the rule management process, enhancing efficiency and helping you keep data quality consistently high.
Now is the right time to cast aside those training wheels and confidently navigate your own path along the data trail!