Agile Import process performance
We have a scenario where users need to import a lot of data across geographies.
By data, I mean item and BOM counts in the six figures, roughly 2-3 hundred thousand.
Is it worthwhile to give users a node-specific URL into the cluster for import, or should they import through the load balancer?
What factors need to be taken care of in the system configuration so that users can load data in parallel in three geographies?
Downtime is common for all three geographies.
What do you mean by geographies? Where are your users in relation to the application servers?
Users will need the Import privilege, and probably Discover / Read / Create / Modify for the objects you want them to import.
If you have hundreds of thousands of items and BOMs to load, you are far better off using DataLoad. It is more efficient and also allows the data to be validated before you try to load it. It also does not go through the application server but talks directly to the database server. But you would also be best served by running DataLoad during off-hours instead of during the middle of the day, as you also have to run a number of scripts afterwards that will burden the database server significantly. Then again, DataLoad is not a customer-facing tool, so you might have to engage the services of a consultant who is experienced in using it, to assist *you* in using it.
Note that Import runs on the local server of the user that starts it, but it does a LOT of talking to the application and database servers. Load balancing of the application and database servers will take care of them, but you might still be constrained by the performance of the local desktop/laptop of the user running Import. If you use DataLoad, you can collect all the data together, process it and get it loaded pretty quickly, all from one server that can be dedicated to that task. Again, using DataLoad is quite different from using Import, and so the planning in each case will be quite different.
If you will have users only importing tens to hundreds of items and BOMs at a time, Import is fine. As Adrian noted, make sure that users have the correct privileges (you may want to create an Import role and assign it all the correct privileges). Having them widely separated won’t make a difference. I assume that your application and database servers are already using load-balancing, so other than making sure network bandwidth is good, there is no reason to assign a specific cluster of nodes for Import.
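Whichever tool is used, checking the files before loading them catches the cheap errors early. The following is a minimal sketch of that idea in Python. It is not DataLoad or Agile's own validation, and the column names (`number`, `parent`, `child`) are assumptions for illustration, not the Import template. It also treats any BOM parent or child missing from the item file as unknown, which would need relaxing if some items already exist in the target system.

```python
import csv
import io


def validate_bom_rows(item_rows, bom_rows):
    """Return a list of human-readable problems found in the import data.

    Column names ("number", "parent", "child") are assumptions for this
    sketch, not the Agile Import template. Row numbers count data rows,
    so the header is not included.
    """
    problems = []
    seen = set()
    for row_no, row in enumerate(item_rows, start=1):
        number = (row.get("number") or "").strip()
        if not number:
            problems.append(f"items row {row_no}: missing item number")
        elif number in seen:
            problems.append(f"items row {row_no}: duplicate item {number}")
        else:
            seen.add(number)

    for row_no, row in enumerate(bom_rows, start=1):
        parent = (row.get("parent") or "").strip()
        child = (row.get("child") or "").strip()
        # Every BOM line must point at items that are part of this load.
        for role, number in (("parent", parent), ("child", child)):
            if number not in seen:
                problems.append(f"bom row {row_no}: unknown {role} {number!r}")
        if parent and parent == child:
            problems.append(f"bom row {row_no}: {parent} lists itself as a child")
    return problems


def read_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))
```

An empty result means the files passed these basic checks only; Agile's own business rules are still applied at import time.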
Thanks for the response. DataLoad is ruled out as an option because the business is very dynamic.
The reason for running the geographies in parallel is the organizational structure. Each geographic team has its own proxy URL so its traffic can be identified, and each team manages a set of site-specific data. For the migration, management wants to use the same site-specific teams to extract the data from the source system and then load it into the target with the Import feature.
AUT and other means are ruled out because of the mammoth size of the data: 500-plus GB, and that excludes file attachments.
To speed things up, we have identified AIS import automation, but we also want to check the feasibility of importing through web services.
Basically, I want the validate-and-import capability that Agile AIS provides, like the one used by the Import/Export feature, which honors preferences and business rules.
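If the import is automated, the per-geography runs can be driven from one script. The sketch below shows only the parallel dispatch, in Python. `run_import` is a hypothetical callable that you would supply to wrap whatever import mechanism is chosen (AIS, web services, or something else); no Agile API is called or assumed here, and the default of three workers simply mirrors the three geographies.

```python
from concurrent.futures import ThreadPoolExecutor


def import_all_sites(site_files, run_import, max_workers=3):
    """Run run_import(site, path) for every site concurrently.

    site_files maps a site name to its data file. run_import is a
    hypothetical, caller-supplied function; this sketch does not know how
    the import itself is performed.

    Returns {site: result or the exception raised}, so one failing site
    does not hide the outcome of the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            site: pool.submit(run_import, site, path)
            for site, path in site_files.items()
        }
        for site, future in futures.items():
            try:
                results[site] = future.result()
            except Exception as exc:  # record the failure and keep going
                results[site] = exc
    return results
```

Threads are used because the work is mostly waiting on the application and database servers, so the real limit is more likely to be server capacity than the driver script.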
Given the volume of data, I agree with Kevin’s approach. What else could be faster than a direct DB hit… The downtime can’t be weeks unless the business agrees.
The import process should be incremental and divided per site rather than a big-bang approach, given the extraordinary amount of data.
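One way to make the load incremental is to split the extracted rows into small per-site batches before importing. This is a minimal Python sketch of that split; the `site` column name and the batch size are assumptions, and it does not account for BOM ordering, which would need to be handled when the batches are planned.

```python
from collections import defaultdict
from itertools import islice


def batches_by_site(rows, batch_size, site_key="site"):
    """Yield (site, batch) pairs, each batch holding at most batch_size rows.

    Rows are grouped by site first, so every batch belongs to one site and
    can be imported, checked, and retried on its own.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[site_key]].append(row)
    for site, site_rows in grouped.items():
        remaining = iter(site_rows)
        while True:
            batch = list(islice(remaining, batch_size))
            if not batch:
                break
            yield site, batch
```

Smaller batches mean a failed import costs less to rerun, at the price of more import runs overall.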