Data Migration Case Studies

Data/Database Migration Case Studies
Client's Industry | Client's Objective | Our Service Delivery / Solution
Manufacturing
Client's Objective: Our client was headquartered in Sweden and manufactured large mining equipment. They had 8 different sites across the world that produced technical documentation (e.g., Parts Catalogs, Technical Service Manuals, Safety Bulletins), and most of the sites were using different legacy systems. Our client purchased a new SaaS system for all of their technical documentation and required all of their legacy technical information to be migrated into it.
Our Service Delivery / Solution: Our client used various legacy systems, including LinkOne, Catbase, SparePort2, and some home-grown systems. We wrote enterprise-level custom migration software that could ingest the data from each of these legacy systems, analyze it, convert it according to our client's specific business rules (which could vary across sites or regions), and ultimately produce XML and SVG files that could be published into the target SaaS system. We also wrote tools that integrated with the SaaS system's APIs to bulk publish all of the converted content.

In total, we migrated:
  • 1,000,000+ parts
  • 300,000+ Engineering Bills of Material (BOMs), aka assemblies
  • 100,000+ Engineering Diagrams
  • 15,000+ Parts Catalog Manuals
  • 100,000+ Related Technical Manuals (TSMs, Service Manuals, Safety Bulletins, etc)
Non-Profit
Client's Objective: Our client works with non-profits. The IRS had recently made all IRS Form 990s publicly available: the listing as large CSV files, and the actual Form 990s, by company and by year, via an XML API. Our client tasked us with ingesting all of this new IRS data into a brand new MySQL database so that it could be searched and analyzed much more easily, since the IRS provided only the lists and the forms as XML, with no easy way to analyze or search the data.
Our Service Delivery / Solution: The Form 990 data was in XML format, and the schema for these XML documents changed from year to year and from version to version, which made this task quite challenging. Rather than writing a custom conversion plugin for each schema, we wrote a single, completely dynamic tool that could handle all of the forms and ingest all of the data into a database, regardless of which IRS Form 990 schema it came from.

In total, we migrated:
  • 1,000,000+ Index Entries
  • 1,000,000+ Form 990 Documents
In addition, we created a web application to search through the non-profit data and provide useful reports about non-profits over the last 20 years (e.g., by income, location, industry).
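As a rough illustration of the schema-independent ingestion described in this case study, the sketch below flattens any XML document into generic (path, value) rows instead of mapping each schema to its own columns. It is not the tool we built: the table name `form_field`, the sample documents, and the use of SQLite in place of MySQL are all assumptions made so the example can run on its own.

```python
import sqlite3
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def flatten(element, prefix=""):
    """Yield (path, text) for every element that carries text."""
    path = f"{prefix}/{local_name(element.tag)}"
    text = (element.text or "").strip()
    if text:
        yield path, text
    for child in element:
        yield from flatten(child, path)


def ingest(conn, doc_id, xml_text):
    """Store one document as generic rows, whatever its schema."""
    root = ET.fromstring(xml_text)
    rows = [(doc_id, path, value) for path, value in flatten(root)]
    conn.executemany("INSERT INTO form_field VALUES (?, ?, ?)", rows)
    return len(rows)


# SQLite stands in for MySQL here so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE form_field (doc_id TEXT, path TEXT, value TEXT)")

# Two made-up documents with different shapes (not real IRS schemas).
doc_a = "<Return><Filer><Name>Example Org</Name></Filer><Revenue>1000</Revenue></Return>"
doc_b = '<Return xmlns="urn:example"><Org>Other Org</Org><TotalRevenue>2500</TotalRevenue></Return>'
count_a = ingest(conn, "a", doc_a)
count_b = ingest(conn, "b", doc_b)
```

Because every value is stored with the path it came from, a new schema version adds new paths rather than requiring new code; the trade-off is that queries must know which paths carry equivalent data across versions.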
Legal
Client's Objective: Our client had a website with a large backend database hosted with one hosting provider, and wanted all of their content (website, database, email sub-system, forms, documents, etc.) migrated to a new host.
Our Service Delivery / Solution: We developed a plan to migrate all of their content and data with no data loss during the migration period and with as little downtime as possible. We ensured that we accurately aggregated all of their content, files and artifacts, as well as all of the data and records in their database.

In total, we migrated:
  • 100,000+ files and documents
  • 55,000+ records

Construction
Client's Objective: Our client had an Asset Management (AM) system, but for their worldwide fleet of construction equipment it was not the system of record (although it should have been). Their objective was for us to accurately gather all of the distinct records of their fleet, as well as the assemblies that make up each machine, which spanned multiple product lines across multiple continents and manufacturing centers, and merge them into one maintainable location that would serve as their system of record (the source of truth for their fleet).
Our Service Delivery / Solution: The challenge in this project was that all of the data came from secondary and tertiary systems, often not from any kind of Asset Management system at all, but from the Technical Publications groups that produced technical documentation for the equipment manufactured at each publisher's site. Our solution involved decrypting, ingesting and analyzing data across multiple systems, and aggregating it into a single database for their worldwide fleet.

In total, we:
  • analyzed records from 4 different documentation systems
  • ingested over 8TB of data and over 18 million files
  • accurately migrated a fleet of 10,000+ records
  • accurately migrated 150,000+ engineering assembly records
E-Commerce
Client's Objective: Our client manufactures reed diffusers for large retail chains throughout the United States (e.g., Walmart, Kohl's, Sears, K-Mart). While a large portion of their business was selling in bulk to large retail chains, they also wanted to offer B2C sales via an e-commerce website. All of their product records were stored in their MAS90 ERP system, and their objective was to migrate these records into a new e-commerce website database schema so that product listings could be made available on the web.
Our Service Delivery / Solution: In addition to migrating the records, we were also responsible for developing a new database schema to store their products.

In total, we:
  • migrated 500+ product records
  • migrated 1,000+ picture records
  • developed a custom database schema for their records

Mining Industry
Client's Objective: Our client had over 40,000 engineering diagrams in CGM format. They hired a third-party company to convert those CGM graphics into SVG format, but the third party converted the images without verifying that the results matched the originals. Some of the CGMs contained data that transformed the image (a rotation transformation), and this was lost in the conversion to SVG. Our client asked us whether we could find a way to fix the orientation of the images.
Our Service Delivery / Solution: The challenge in this project was that not every image needed to be changed: some images were correctly portrait, and others were correctly landscape. The difficulty was finding the images that had ended up in one orientation when they should have been in the other. To solve this, we wrote a custom algorithm to read the SVG data and ingest and interpret each image's matrix transformations, and we wrote reports against that data. From these we determined which conditions indicated that an image needed to be rotated 90 degrees, and whether that rotation should be clockwise or counterclockwise.

In total, we:
  • ingested over 50GB of image data
  • resolved all incorrect orientations for the images, on a set of 40,000+ SVG engineering diagrams
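To give a sense of what reading an image's matrix transformations can look like, here is a minimal sketch that pulls `matrix(a b c d e f)` values out of an SVG and derives the rotation angle they imply. The function names and the sample SVG are hypothetical, and the sketch only covers this one measurement; the conditions we actually used to decide which images needed rotating are not reproduced here.

```python
import math
import re
import xml.etree.ElementTree as ET

MATRIX_RE = re.compile(r"matrix\(([^)]*)\)")


def rotation_degrees(transform):
    """Return the rotation implied by an SVG matrix(a b c d e f), or None."""
    match = MATRIX_RE.search(transform or "")
    if not match:
        return None
    numbers = [float(n) for n in re.split(r"[\s,]+", match.group(1).strip())]
    if len(numbers) != 6:
        return None
    a, b = numbers[0], numbers[1]
    # For a pure rotation the matrix is (cos t, sin t, -sin t, cos t, e, f),
    # so atan2(b, a) recovers t.
    return round(math.degrees(math.atan2(b, a)), 3)


def matrix_rotations(svg_text):
    """Collect the rotation of every element that has a matrix transform."""
    root = ET.fromstring(svg_text)
    angles = []
    for element in root.iter():
        angle = rotation_degrees(element.get("transform"))
        if angle is not None:
            angles.append(angle)
    return angles


# Hypothetical sample: one group rotated a quarter turn.
SAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g transform="matrix(0 1 -1 0 100 0)"><path d="M0 0L10 0"/></g>'
    "</svg>"
)
angles = matrix_rotations(SAMPLE_SVG)
```

Because the SVG y-axis points down, a positive angle here appears clockwise on screen. Angles like these can be written to a table alongside each image's width and height, which is the kind of data the reports were run against.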
Original Equipment Manufacturer (OEM)
Client's Objective: Our client is based in France and manufactures large equipment. They were upgrading their publishing system, and in doing so they had to convert over 100,000 images into PNG format and migrate them into a new system.
Our Service Delivery / Solution: Our solution was fairly straightforward: we converted all of their images/diagrams into PNG format, quality assured the results, and then wrote a custom tool to migrate those images into their new system.

In total, we:
  • converted 100,000+ images
  • migrated 100,000+ images into their new system

Original Equipment Manufacturer (OEM)
Client's Objective: Our client was a large OEM using a legacy system called LinkOne. All of their Parts Catalogs were originally created and published in LinkOne. The client purchased a subscription to a new Software as a Service (SaaS) platform with many new and improved features, including making all of their parts catalogs and parts available online and accessible across their distribution and dealer networks. The objective was to accurately gather all of the legacy content and migrate it into the new system.
Our Service Delivery / Solution: One of the key challenges in this migration was that the legacy system did not have a backend database; all of the content was stored in folders on a file system. This created a uniqueness challenge: assembly files could exist in different folders with exactly the same filename, yet contain different assembly content (e.g., a different version or revision of the assembly). For example, /some/folder/1101.ldf could be entirely different from /another/folders/1101.ldf, even though both shared the same filename. To solve this, we used a special hashing algorithm to hash the contents of each file and determine when an assembly was actually unique. The end result was an accurate migration of unique content into the new system, without extra fluff or duplication.

In total, we migrated:
  • 100,000+ List Definition Files
  • 40,000+ unique assemblies
  • 50,000+ unique engineering diagrams/images
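The content-hashing idea can be sketched as follows. This is an illustration only: SHA-256 is an assumed stand-in for the hashing algorithm we actually used, and the helper names and demo folder layout are hypothetical. Files are grouped by a digest of their bytes, so identical content collapses into one group no matter which folder or filename it lives under.

```python
import hashlib
import tempfile
from pathlib import Path


def content_digest(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(root, pattern="*.ldf"):
    """Map each content digest to every path that holds that content."""
    groups = {}
    for path in sorted(Path(root).rglob(pattern)):
        groups.setdefault(content_digest(path), []).append(path)
    return groups


# Demo: three files named 1101.ldf, two of which are byte-identical.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = [
        ("some/folder", b"rev A"),
        ("another/folders", b"rev B"),
        ("third", b"rev A"),
    ]
    for folder, body in demo_files:
        target = Path(tmp, folder, "1101.ldf")
        target.parent.mkdir(parents=True)
        target.write_bytes(body)
    groups = group_by_content(tmp)
    unique_count = len(groups)
    group_sizes = sorted(len(paths) for paths in groups.values())
```

In the demo, the three same-named files reduce to two unique assemblies: one group with a single path and one group with two paths pointing at the same content.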
Drilling Equipment Supplier (Oil/Gas)
Client's Objective: Our client is one of the world's leading providers of drilling services, equipment and performance tooling for mining and drilling companies. Their aftermarket parts manuals had been created and published in a legacy system named Catbase. They too purchased a new software solution for their aftermarket parts catalogs, and their objective was to extract all of their existing content and migrate it into their new online system.
Our Service Delivery / Solution: A challenge in this project was accurately extracting the data from the legacy system. The developer of the Catbase program had implemented their own kind of 'encryption' algorithm to make migrating data out of the system difficult. However, it turned out the files were not truly encrypted; it was more of an 'encrypted by obscurity' method. Once we discovered a method for accurately extracting the data, we were able to migrate all of their parts catalogs into their new system.

In total, we migrated:
  • 1,000+ technical and parts manuals
  • 10,000+ unique engineering assemblies
  • 40,000+ unique engineering parts

Construction
Client's Objective: Our client used FrameMaker to generate many of their technical / parts manuals. Although at the time they considered FrameMaker good for generating content, it falls short from a structured data perspective, and all of the data lives in FrameMaker's proprietary file format. Our client had thousands of FrameMaker MIF files and needed a way to report on the data, extract it into a more useful format, and ultimately migrate it into a new system.
Our Service Delivery / Solution: Our solution was to write a custom algorithm to parse MIF files, looking for the specific types of data our client was interested in. MIF files can be difficult to work with, as they are not structured like an XML file; in fact, they're more closely related to a Portable Document Format (PDF) file, but the way references and links are handled can be even trickier. We essentially reverse engineered the way their data was represented within the MIF files, wrote an algorithm to parse it, and ingested all of the meaningful data into a custom database. From there, we could easily present reports to the client that answered key questions such as: how many unique assemblies are there, how many cases of a certain exception exist, and where are there likely typos in a field whose values do not conform to the field's intended format? Answering these questions and more via custom reports proved to be a key ingredient in the overall success of the data migration.

In total, we:
  • ingested 10,000+ MIF files
  • provided 2+ dozen custom reports that answered key questions about the data
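A much-simplified sketch of this kind of MIF parsing and format-check reporting is shown below. It assumes the text lives in `<String ...>` statements inside `<Para` blocks labeled by `<PgfTag ...>`, ignores MIF escape sequences, and uses a made-up paragraph tag (`PartNumber`) and a made-up part-number format; the client's real tags, fields and rules are not shown here.

```python
import re

# Matches the start of a paragraph, or a PgfTag/String statement whose
# value is quoted MIF-style with a backtick and an apostrophe.
TOKEN_RE = re.compile(r"<Para\b|<(PgfTag|String) `([^']*)'>")

# Hypothetical rule: four digits, a dash, one capital letter.
PART_NUMBER_RE = re.compile(r"^\d{4}-[A-Z]$")


def paragraphs(mif_text):
    """Yield (paragraph_tag, text) for each <Para block, in file order."""
    tag, parts, started = None, [], False
    for match in TOKEN_RE.finditer(mif_text):
        kind, value = match.group(1), match.group(2)
        if kind is None:  # a new <Para statement begins
            if started:
                yield tag, "".join(parts)
            tag, parts, started = None, [], True
        elif kind == "PgfTag":
            tag = value
        else:  # a String fragment belonging to the current paragraph
            parts.append(value)
    if started:
        yield tag, "".join(parts)


SAMPLE = """<MIFFile 7.00>
<Para
 <PgfTag `PartNumber'>
 <ParaLine
  <String `1101-A'>
 >
>
<Para
 <PgfTag `PartNumber'>
 <ParaLine
  <String `11O1-B'>
 >
>
<Para
 <PgfTag `Body'>
 <ParaLine
  <String `Hydraulic '>
  <String `pump'>
 >
>
"""

extracted = list(paragraphs(SAMPLE))
# Report values that do not match the field's intended format.
suspects = [
    text
    for tag, text in extracted
    if tag == "PartNumber" and not PART_NUMBER_RE.match(text)
]
```

Here the second part number contains a letter O where a zero was intended, so it is the only value flagged. In practice the extracted rows would be loaded into a database so that each report becomes a query.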
Non-Profit
Client's Objective: Our client is a service provider for non-profits. Their web application lets users complete fundraising registration directly within the application. While our client offered their customers forms for entering this data, the web application was missing a reporting module that would let customers run and render reports against the data they had entered. Our client requested a custom reporting tool to fill that gap.
Our Service Delivery / Solution: We wrote a custom reporting module that lets their customers run custom reports against the data they entered into the system. The solution was dynamic: it allowed various properties and fields to be reported on, and presented the resulting reports to the user in a clean and simple manner.

Non-Profit
Client's Objective: Our client found that having an expert service provider capable of providing ad-hoc data analysis and reporting, whenever needed, would help them make better business decisions. They retained us, and we continue to provide ongoing expert assistance whenever they need insight into a question about the data in their database.
Our Service Delivery / Solution: Our solution is always based on understanding the exact question or problem, and what data or information is required to answer it. We then ingest the data, analyze it, interpret it, and present the results to our client.
Engineering
Client's Objective: Our client had tens of thousands of engineering diagrams, and they wanted to know how often a certain condition was present within their drawings.
Our Service Delivery / Solution: Images are generally not considered easily analyzable data. Most people are familiar with raster graphics (BMP, GIF, JPG, etc.), but these lose quality as the image is scaled up. To keep an image or drawing scalable, it is better to create it as a vector graphic rather than a raster graphic. Most modern engineers design in CAD systems, and CAD drawings can be exported to a vector format such as SVG (Scalable Vector Graphics), which retains the mathematical data that makes up the drawing. A vector image is made up of math (lines, curves, matrix transformations, etc.) rather than colored pixels. As a result, we were able to ingest all of their vector image data into a database and analyze it via complex modeling and queries. This allowed us to locate the conditions of interest and tell the client which drawings contained them. This was an unusual case of data analysis, but it proved effective at answering the questions posed by our client.
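The idea of treating drawings as queryable data can be sketched as below. The example is hypothetical throughout: the table layout, the two sample drawings, and the "condition" being searched for (any element carrying a transform) are assumptions chosen for illustration, and SQLite is used only so the sketch is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_drawing(conn, drawing_id, svg_text):
    """Insert one row per SVG element: drawing, element name, transform."""
    root = ET.fromstring(svg_text)
    rows = [
        (drawing_id, element.tag.rsplit("}", 1)[-1], element.get("transform"))
        for element in root.iter()
    ]
    conn.executemany("INSERT INTO svg_element VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE svg_element (drawing_id TEXT, element TEXT, transform TEXT)"
)

# Two made-up drawings standing in for exported CAD diagrams.
drawings = {
    "d1": '<svg xmlns="http://www.w3.org/2000/svg">'
          '<path d="M0 0L5 5"/><text>A1</text></svg>',
    "d2": '<svg xmlns="http://www.w3.org/2000/svg">'
          '<g transform="rotate(90)"><line x1="0" y1="0" x2="1" y2="1"/></g></svg>',
}
for drawing_id, svg_text in drawings.items():
    load_drawing(conn, drawing_id, svg_text)

# Example condition: which drawings contain any transformed element?
transformed = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT drawing_id FROM svg_element "
        "WHERE transform IS NOT NULL ORDER BY drawing_id"
    )
]

# Example frequency report: how many elements of each kind per drawing?
element_counts = conn.execute(
    "SELECT drawing_id, element, COUNT(*) FROM svg_element "
    "GROUP BY drawing_id, element ORDER BY drawing_id, element"
).fetchall()
```

Once the geometry is in rows like these, "how often does condition X occur" becomes an ordinary aggregate query over the whole drawing set.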