Importing 20 Million Records

Dan Ridinger dlr at futurechalk.com
Wed Jun 30 16:58:57 UTC 2021


Hello Das,

Graeme is correct: to get throughput when using “insert into”, one needs to be cognizant of the length of the transaction. The ideal batch size varies by database. In SQL Server the magic number is about 500 rows per transaction; in PostgreSQL and Oracle it is around 1000. The reason this number is so important is that the overhead of managing a transaction is very high. It is the most expensive activity that goes on in a database, and the consumption of resources increases as more transactions are created, which is why the processing of records slows down once the database passes that critical mass of 500 or 1000. Since SQLite is a less complex database, one might be able to use a batch of 5000 and not see as much performance degradation. Finding the sweet spot requires trial-and-error testing.
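
As a rough illustration, here is a minimal sketch of that batching pattern as plain SQL that the sqlite3 shell will accept. The table name, columns, and values are hypothetical, and the batch size of roughly 1,000 rows is only a starting point for the trial-and-error tuning mentioned above.

    -- Hypothetical staging table; name and columns are placeholders.
    CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);

    -- Batch 1: roughly 1,000 INSERTs wrapped in a single transaction.
    BEGIN TRANSACTION;
    INSERT INTO orders (id, customer, amount) VALUES (1, 'Acme', 10.00);
    INSERT INTO orders (id, customer, amount) VALUES (2, 'Bravo', 12.50);
    -- ... continue up to the chosen batch size ...
    COMMIT;

    -- Batch 2: start a fresh transaction for the next block of rows.
    BEGIN TRANSACTION;
    INSERT INTO orders (id, customer, amount) VALUES (1001, 'Carson', 7.25);
    -- ...
    COMMIT;

In a loader program you simply issue COMMIT and a fresh BEGIN every N rows, where N is the batch size you settle on.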

The other thing that should be mentioned when you are loading that volume of records is to remove all indexes before loading, then re-add the index definitions once the data is loaded. Maintaining indexes while loading a large dataset also degrades load speed, because every insert has to update the index trees. Building the indexes at the end executes much faster and produces a more optimal index tree.
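
A minimal sketch of that drop-and-rebuild pattern, again assuming the hypothetical orders table and an index name of idx_orders_customer:

    -- Drop the index (the name here is a placeholder) before the bulk load...
    DROP INDEX IF EXISTS idx_orders_customer;

    -- ... run the batched INSERT transactions from the sketch above ...

    -- ...then rebuild it once all the rows are in, so the tree is built in one pass.
    CREATE INDEX idx_orders_customer ON orders (customer);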

Review the SQLite command line options, especially the dot commands.
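
For example, the sqlite3 shell can bulk-load a delimited file directly with the .mode and .import dot commands; the database name, CSV file, and table below are placeholders:

    $ sqlite3 import_test.db
    sqlite> .mode csv
    sqlite> .import data.csv orders
    sqlite> .tables

If the target table does not already exist, .import creates it and uses the first row of the file as the column names.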


Dan Ridinger
Managing Director




FutureChalk Software Inc.
20521 92A Avenue
Langley, BC  V1M 1B7

Phone No: 604.723.6837
EMail: dlr at futurechalk.com
www: www.futurechalk.com

> On Jun 30, 2021, at 2:45 AM, Graeme Whiting via omnisdev-en <omnisdev-en at lists.omnis-dev.com> wrote:
> 
> Hi,
> 
> I would do this with multiple SQL "insert into" statements in a single SQL transaction. I recently did this for an import of a million rows and it was hitting over 1,000 records per second running from home against an AWS-hosted Oracle database.
> 
> Basically, use Begin statement / Sta: / End statement with lots of "insert into" statements within it, and then execute it every e.g. 1,000 records.
> 
> Graeme.
> 
> 
> -----Original Message-----
> From: omnisdev-en <omnisdev-en-bounces at lists.omnis-dev.com> On Behalf Of Das Goravani
> Sent: 25 June 2021 17:38
> To: OmnisDev List - English <omnisdev-en at lists.omnis-dev.com>
> Subject: Importing 20 Million Records
> 
> Dear Friends,
> 
> In Omnis I am familiar with importing, but never 20 million records.
> 
> If you have a huge file with 20 million records in it, how do you import that?
> 
> Omnis works with a list.. you import to a list.. Omnis can handle between 1 and 2 million rows.. from my experience
> 
> Do you cut up the file.. do you get some utility for that?
> 
> Because you can’t open such a file in any program like Word or Excel or Numbers or Pages
> 
> So you’d need some utility meant for cutting up large files
> 
> Or is there a way to tell Omnis to only import part, then next part, then next.. I’ve never seen that.
> 
> Thanks
> 
> Das Goravani


