Table 1

Detailed scoring criteria per feature

| Feature tree (level 1) | Feature tree (level 2) | Feature tree (level 3) | Score |
|---|---|---|---|
| Data ingestion and integration | Data consolidation | Connectivity to N data sources | 5 |
| | | Data extraction, transformation and loading (ETL) and ETL support | 5 |
| | | Data modelling | 5 |
| | Data propagation | Data flow orchestration, enterprise application integration, exchange of messages and transactions | 5 |
| | | Enterprise data replication, transferring large amounts of data between databases | 5 |
| | | Versioning and file management | 5 |
| | Data virtualisation | Data access | 5 |
| | Data federation | Enterprise information integration | 5 |
| | | Total | 40 |
| Data preparation and cleaning | Parsing and standardisation | Tagging data with keywords, descriptions or categories | 5 |
| | | (Data scrubbing)/cleansing/handling blank values/reformatting values/threshold checking | 5 |
| | | Data enhancement/enrichment/curation | 5 |
| | | Natural language processing | 5 |
| | | Address validation/geocoding | 5 |
| | | Master data management | 5 |
| | | Data masking | 5 |
| | Identity resolution, linkage, merging and consolidation | Data deduping | 5 |
| | | Machine learning/training a statistical model | 5 |
| | | Data aggregation | 5 |
| | | Data binning | 5 |
| | | Grouping similar data/clustering | 5 |
| | | Outlier detection and removal | 5 |
| | Master reference data management | ‘Hub’ infrastructure to source and distribute master/reference data | 5 |
| | | Master data versioning based on data history and timelines | 5 |
| | | Workflow integrations to steward and publish the master/reference data | 5 |
| | | Graph data stores to define relationships for creating a flexible knowledge graph | 5 |
| | | Accessible API (Application Programming Interface) for real-time access to shared reference data | 5 |
| | | Total | 90 |
| Data profiling, exploration/pattern detection | Relationship discovery | Cross-table redundancy analysis | 5 |
| | | Performing data quality assessment, risk of performing joins on the data | 5 |
| | | Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies and performing intertable analysis | 5 |
| | Content discovery | Data pattern discovery | 5 |
| | | Domain analysis | 5 |
| | | Discovering metadata and assessing its accuracy | 5 |
| | Structure discovery | Column value frequency analysis and statistics, collecting descriptive statistics like min, max, count and sum | 5 |
| | | Table structure analysis, collecting data types, length and recurring patterns | 5 |
| | | Drill-through analysis | 5 |
| | | Total | 45 |
| Data monitoring | Monitoring and alerting | Time series data identification and collection by metric name and key/value pairs | 5 |
| | | Flexible query language to leverage this dimensionality | 5 |
| | | Graphing and dashboarding support | 5 |
| | | Total | 15 |
| Data use | Metadata management | Concept identification and naming | 5 |
| | | Data categorisation | 5 |
| | | Lineage | 5 |
| | | Relationship with other metadata | 5 |
| | | Comments and remarks | 5 |
| | | Data statistics (profiles) | 5 |
| | | Knowledge graph | 5 |
| | Privacy and security | Data anonymisation | 5 |
| | | Role-based access control | 5 |
| | | Secure environment setup and deployment | 5 |
| | | Container-based deployment | 5 |
| | Data mining | Interactive data visualisation | 5 |
| | | Visual programming and analysis | 5 |
| | | Visual illustrations and training documentation | 5 |
| | | Sample data/generate fake data | 5 |
| | | Add-ons and extension functionality | 5 |
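
The category totals in Table 1 are consistent with each listed feature contributing a maximum of 5 points. The following is a minimal sketch of that arithmetic, assuming the Score column gives the maximum attainable score per feature. The names `MAX_SCORE_PER_FEATURE`, `FEATURE_COUNTS` and `category_totals` are illustrative and do not come from the source, and only categories with a stated total row are included.

```python
# Assumption: the value 5 in the Score column is the maximum score per feature.
MAX_SCORE_PER_FEATURE = 5

# Number of scored features listed under each top-level category in Table 1.
# Only categories that have an explicit "Total" row are included here.
FEATURE_COUNTS = {
    "Data ingestion and integration": 8,
    "Data preparation and cleaning": 18,
    "Data profiling, exploration/pattern detection": 9,
    "Data monitoring": 3,
}


def category_totals(feature_counts, max_score=MAX_SCORE_PER_FEATURE):
    """Return the maximum attainable score for each category."""
    return {name: count * max_score for name, count in feature_counts.items()}


totals = category_totals(FEATURE_COUNTS)
for name, total in totals.items():
    print(f"{name}: {total}")
```

Under this assumption the computed values match the total rows given in the table (40, 90, 45 and 15).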