antFarm is a lift 'n' shift data migration solution that supports both cloud and on-premises data sources. It opens a path to IT modernization, bringing various benefits, including reduced costs, improved performance and the resiliency of cloud architecture.
antFarm solves many of the challenges of reading data sources quickly and generating files optimized for the fastest possible data load.
Architecture
antFarm is composed of two main components:
Execution engine, installed in a local environment. This is where the ant colony resides and works to process data as fast as possible.
Central metadata repository, created on the target destination. This is what we call the Queen Ant's residence, where she can find everything she needs to successfully manage the working ants.
High-level architecture (diagram: cloud environment and local environment)
antFarm supports different target destinations
For larger enterprises or groups of companies, distributed execution engines can share a single central repository.
Scalable parallel execution
How does antFarm work?
After you have installed antFarm, the first step is to establish connections to your data sources and to the target destination. antFarm uses named connection strings to databases and filesystems.
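The document does not show antFarm's actual configuration syntax, but the idea of named connection strings for databases, filesystems and the target destination can be sketched as follows (all names and connection-string formats below are illustrative assumptions, not antFarm's real configuration):

```python
# Hypothetical named connection strings -- illustrative only,
# not antFarm's actual configuration format.
connections = {
    "src_erp": "postgresql://reader@erp-db.local:5432/erp",        # source database
    "src_files": "file:///mnt/exports/daily",                      # source filesystem
    "tgt_dwh": "snowflake://loader@acme.example.com/DWH",          # target destination
}

def get_connection_string(name: str) -> str:
    """Look up a connection string by its name."""
    return connections[name]
```

Referring to connections by name keeps the flow definitions portable: the same flow can point at a different environment simply by changing what the name resolves to.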
antFarm automatically:
To define the data load execution logic, you first need to configure the data processing flow. A data processing flow is composed of different steps, e.g. extract, truncate, load and process. You can define as many steps as you like.
In addition, a parametrized workflow takes care of process standardization and dictates the data processing flow's behaviour.
Within the workflow, you assign various operations to each step. In general, there are two types of operations:
For example, a load step could be composed of put and copy operations.
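The flow-and-workflow relationship described above can be sketched like this (step and operation names are illustrative assumptions, not antFarm's actual syntax):

```python
# Hypothetical sketch of a data processing flow and its workflow.
# Step and operation names are illustrative, not antFarm's API.
flow = ["extract", "truncate", "load", "process"]  # any number of steps is allowed

# The workflow assigns operations to each step; as in the example above,
# a load step could be composed of put and copy operations.
workflow = {
    "extract": ["unload_to_csv"],
    "truncate": ["truncate_table"],
    "load": ["put", "copy"],
    "process": ["run_sql"],
}

def operations_for(step: str) -> list[str]:
    """Return the operations the workflow assigns to a given step."""
    return workflow.get(step, [])
```

Keeping the workflow separate from the flow is what makes the process standardized: the same workflow template can drive many flows.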
During the data processing flow, each step is assigned one or more workers. Workers (we call them ants) correspond to hardware resources: the more resources you have, the more workers can be activated, resulting in faster data processing.
Based on the metadata defined in the central repository for each data processing flow step, separate queues are populated with tasks. Each task is then processed by a working ant according to the workflow settings.
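The queuing model — tasks populated from repository metadata and consumed by worker ants — can be illustrated with a minimal sketch (this shows the general pattern, not antFarm's implementation):

```python
# Minimal sketch of a per-step task queue processed by worker "ants".
# Illustrates the queuing model only, not antFarm's implementation.
import queue
import threading

tasks = queue.Queue()
for table in ["customers", "orders", "invoices"]:
    tasks.put(("load", table))  # tasks populated from repository metadata

results = []
lock = threading.Lock()

def ant():
    """One worker: pull tasks until the queue is empty."""
    while True:
        try:
            step, table = tasks.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append(f"{step}:{table}")  # stand-in for the real operation
        tasks.task_done()

workers = [threading.Thread(target=ant) for _ in range(2)]  # two ants
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Adding more ants (threads, or in antFarm's case hardware-backed workers) lets independent tasks from the queue run in parallel.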
Optimization and table partitioning
antFarm updates queued tasks with start and end times. The gathered data is available in predefined reports, one of which displays load execution times.
If you are not satisfied with the execution times, one thing you can do is partition tables. antFarm then generates multiple tasks and extracted CSV files for a single table to achieve the best data load performance through parallel execution.
An additional option is, as mentioned, to scale the hardware.
Benefits and Key Features
Easy to use
The whole data movement process is defined with standard SQL syntax.
Bulk loads
Data is imported into the destination in batches.
Parallel execution
antFarm was developed for speed and efficiency.
Scalability
The more hardware resources you add, the faster the execution you get.
Logging
Detailed, configurable logging and reporting are available out of the box.
Serial execution
Streaming for processing and synchronization.
Automation
Target tables, data type conversions and ETL queries are automatically generated.
Completely open solution
antFarm can be integrated into any data integration tool.
Any custom process
You can run any kind of SQL or Python processes (operations).
Out-of-the-box support
We’re constantly growing the list of supported data sources and target destinations. If you need to access data from a source which isn’t currently supported, please get in touch. antFarm can be easily extended.
Supported data sources and data warehouse destinations
References
Do you need more than the lift 'n' shift data migration solution?
Check out our data warehouse automation solution DataMerlin, which makes data warehouse implementation 10 times faster and cuts the related TCO 10-fold.