Softpedia
 



    DataCleaner 2.5.2

    Downloads: 1,156
    User Rating: Good (3.3/5), rated by 20 user(s)
    Developer: Kasper Sørensen
    License / Price: LGPL / FREE
    Last Updated: May 1st, 2012, 06:28 GMT
    Category: ROOT / Database / Database APIs

    DataCleaner description

    DataCleaner is a solution for businesses and organizations wishing to measure and increase the quality of their data. DataCleaner includes functionality to profile and compare data, to validate data against business rules, and to monitor the progression of these measurements over time. It includes both a standalone desktop application for exploring and defining the data quality effort and a Web application for continuous data quality deployments.

    The field of data quality has long been overlooked in the IT business. Organizations are beginning to feel the pain of inconsistent and flawed systems, and interest is growing in setting more ambitious goals for the quality of their data. To illustrate the concept, imagine an information chain centered on the process of building a data warehouse. Although data quality principles apply in many other scenarios, the data warehouse is the archetypal case, since it seeks to create a single version of the truth - and obviously this truth has to be of high quality!

    The DataCleaner project aims to work seriously and ambitiously towards a framework for data quality. In the example above, the goal of DataCleaner is to help data warehouse professionals better understand the source systems they work with, and to apply that understanding to both the input and the output of their process. This is how the quality of the data is ensured. Some might even say that we test the data warehouse, just as we test our products for flaws and our software for bugs.

    So in short...

    DataCleaner is a data quality component, application and monitor for profiling, validating and comparing data.

    Known issues:
    DataCleaner cannot read CSV files with missing values or blank lines. Every line must contain the full set of commas, consistent with the CSV format.
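
One possible workaround for the CSV issue above is to normalize a file before loading it. The following is a minimal sketch, not part of DataCleaner: the function name `normalize_csv` is hypothetical, and it assumes that dropping blank lines and padding short rows with empty fields is acceptable for your data.

```python
import csv
import io


def normalize_csv(text, delimiter=","):
    """Drop blank lines and pad short rows so every line has the same field count.

    Hypothetical helper for illustration; not a DataCleaner API.
    """
    rows = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # csv.reader yields [] for an empty line and ['  '] for a
        # whitespace-only line; skip both.
        if not row or (len(row) == 1 and not row[0].strip()):
            continue
        rows.append(row)
    if not rows:
        return ""

    # Pad every row to the width of the widest row.
    width = max(len(row) for row in rows)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow(row + [""] * (width - len(row)))
    return out.getvalue()
```

The padded output keeps a delimiter for every column on every line, which is the shape the known issue asks for.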


    What's New in This Release:

    Apache CouchDB support:

    · We've added support for the NoSQL database Apache CouchDB. DataCleaner supports reading from, analyzing and writing to your CouchDB instances.

    Update table writer:

    · Following our previous efforts to bring ETLightweight-style features into DataCleaner, we've added a writer which updates records in a table. You can use this for example to insert or update records based on specific conditions.

    · Like the Insert into table writer, the new DataCleaner Update table writer is not restricted to SQL-based databases. It works with any datastore type that supports writing (currently relational databases, CSV files, Excel spreadsheets, CouchDB databases and MongoDB databases), and the semantics are the same as those of a traditional UPDATE statement in SQL.
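
For readers less familiar with the SQL semantics referred to above, here is a small sketch of a conditional update using Python's built-in `sqlite3` module. It illustrates plain SQL only, not DataCleaner's writer; the `customers` table, its columns and the function name `demo_update` are made up for the example.

```python
import sqlite3


def demo_update():
    """Run a conditional UPDATE on an in-memory table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ann", "dk"), (2, "Bob", "DK"), (3, "Cy", "SE")],
    )

    # Only records matching the condition are changed; all others are untouched.
    cursor = conn.execute(
        "UPDATE customers SET country = ? WHERE country = ?", ("DK", "dk")
    )
    updated = cursor.rowcount

    rows = conn.execute(
        "SELECT id, name, country FROM customers ORDER BY id"
    ).fetchall()
    conn.close()
    return updated, rows
```

The condition in the WHERE clause plays the role of the "specific conditions" mentioned in the release notes: it decides which records the update touches.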

    Drill-to-detail information saved in result files:

    · When using the Save result feature of DataCleaner 2.5, some users found that their drill-to-detail information was lost. DataCleaner 2.5.2 now persists this information as well, making your DQ archives much more valuable when investigating historic data incidents.

    Improved EasyDQ error handling:

    · The EasyDQ components now handle errors better. If a momentary network issue or a similar problem causes a few records to fail, the EasyDQ components will recover gracefully and, most importantly, your batch job will complete in spite of the errors.
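
The general idea of letting a batch survive per-record failures can be sketched as follows. This is an assumption-laden illustration, not the EasyDQ implementation: `process_batch`, the `enrich` callback and the retry count are all hypothetical, and `ConnectionError` merely stands in for whatever a momentary network issue would raise.

```python
def process_batch(records, enrich, retries=2):
    """Apply `enrich` to each record, retrying on transient errors.

    A record that still fails after all retries is logged in `failures`
    and the batch moves on instead of aborting. Hypothetical sketch.
    """
    results, failures = [], []
    for record in records:
        for attempt in range(retries + 1):
            try:
                results.append(enrich(record))
                break  # success, move on to the next record
            except ConnectionError as exc:
                if attempt == retries:
                    # Give up on this record only; keep the batch running.
                    failures.append((record, str(exc)))
    return results, failures
```

Collecting failures separately lets the caller report or reprocess the few bad records afterwards without losing the rest of the batch.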

    Table mapping for NoSQL datastores:

    · Since CouchDB and MongoDB are not table-based but have a more dynamic structure, we provide two approaches to working with them: the default, which lets DataCleaner autodetect a table structure, and the advanced, which lets you manually specify your desired table structure. Previously the advanced option was only available through XML configuration, but the user interface now contains dialogs for doing this directly in the application.
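
To make the two approaches concrete, the sketch below shows one simple way a table structure could be derived from schemaless documents, alongside a manually specified column list. It is an assumption for illustration and does not reflect how DataCleaner actually detects structure; `detect_columns` and `to_rows` are hypothetical names.

```python
def detect_columns(documents):
    """Autodetect columns as the union of keys, in first-seen order."""
    columns = []
    for doc in documents:
        for key in doc:
            if key not in columns:
                columns.append(key)
    return columns


def to_rows(documents, columns=None):
    """Flatten documents into rows; fields missing from a document become None.

    Pass `columns` to specify the table structure manually, or leave it
    as None to autodetect it from the documents.
    """
    if columns is None:
        columns = detect_columns(documents)
    rows = [tuple(doc.get(column) for column in columns) for doc in documents]
    return columns, rows
```

With documents `{"_id": 1, "name": "Ann"}` and `{"_id": 2, "city": "Oslo"}`, autodetection yields the columns `_id`, `name` and `city`, while passing `columns=["_id", "city"]` restricts the table to those two fields.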

      


    TAGS:

    data quality component | profiling monitor | data validation | data | quality | component


