Best approach to avoid memory explosion during batch importing

Hello Forum,

Please excuse me for posting here, but I think you are the best people to answer my question. I have to read from a CSV file (which can be up to 3 GB).

My idea is to read the file (which is on Azure, but that is not relevant) as a stream and then process it without creating new objects in memory. Here is my current approach:

 

 

public void Copy(Stream stream)
{
    string configContent = File.ReadAllText("schema.json");
    SchemaConfig schemaConfig = JsonConvert.DeserializeObject<SchemaConfig>(configContent);
    DataTable sourceData = PrepareDataTable();

    string sqlServerConnection = ConfigurationManager.AppSettings.Get("server:sql");

    // SqlBulkCopy is IDisposable, so dispose it together with the reader
    using (SqlBulkCopy bcp = new SqlBulkCopy(sqlServerConnection))
    using (TextReader reader = new StreamReader(stream))
    {
        bcp.DestinationTableName = schemaConfig.DestinationTable;

        // Lazy sequence: no line is read until it is enumerated
        var lines = reader.Lines(rowToSkip: schemaConfig.RowsToSkip);
    }
}


where Lines is a simple extension method:

 

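public static IEnumerable<string> Lines(this TextReader reader, int rowToSkip = 0)
{
    // Skips the header row plus rowToSkip additional rows (hence the <=)
    for (int i = 0; i <= rowToSkip; i++)
    {
        reader.ReadLine();
    }

    // Yields the remaining lines one at a time; each ReadLine() call allocates a new string
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        yield return line;
    }
}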


Since I have to perform the insert, my idea was to create a class that implements IDataReader and reads
each line on request, but then I have to call Split() on every line, which allocates memory.

Is there a way to iterate through the string without allocating memory?

Excuse me again for using this forum, but you are the most skilled people I know.

Thanks





1 comment
reader.ReadLine() allocates a new string on every call. Even if you don't keep that string in memory, it still produces memory traffic. Please read this blogpost for details.
As a quick answer, I would recommend parsing the stream manually, byte by byte, or even writing your own Stream implementation if the standard one allocates too much memory (this requires studying how the standard Stream is implemented).
Or, much better, ask your question on https://stackoverflow.com/, which is the right place for such questions.
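
To make the byte-by-byte suggestion above more concrete, here is a minimal sketch of one way it could look. It is not taken from the original post: CsvScanner, FieldHandler and Scan are hypothetical names, and the code assumes a runtime where Span<T> and ArrayPool<T> are available (.NET Core 2.1 or later, or the System.Memory package on .NET Framework). It also assumes single-byte delimiters, no quoted fields, and that no row is longer than the buffer. The stream is read in chunks into one pooled buffer, and each field is handed to a callback as a slice of that buffer, so no string is allocated per line or per field.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical callback type: receives one field as a slice of the shared buffer.
// The slice is only valid during the call and must not be stored.
public delegate void FieldHandler(ReadOnlySpan<byte> field, long row, int column);

public static class CsvScanner
{
    // Scans delimiter-separated rows and returns the number of rows seen.
    // Assumptions: no quoted fields, rows end with '\n' or "\r\n",
    // and every row fits into the rented buffer.
    public static long Scan(Stream stream, FieldHandler onField,
                            byte delimiter = (byte)',', int bufferSize = 64 * 1024)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        long row = 0;
        int filled = 0; // bytes in the buffer that belong to an unfinished row
        try
        {
            while (true)
            {
                int read = stream.Read(buffer, filled, buffer.Length - filled);
                if (read == 0) break;
                filled += read;

                // Emit every complete row currently in the buffer.
                int start = 0;
                int newline;
                while ((newline = Array.IndexOf(buffer, (byte)'\n', start, filled - start)) >= 0)
                {
                    EmitRow(new ReadOnlySpan<byte>(buffer, start, newline - start), row++, delimiter, onField);
                    start = newline + 1;
                }

                // Move the unfinished tail to the front so the next read appends to it.
                filled -= start;
                if (filled == buffer.Length)
                    throw new InvalidDataException("A row is longer than the buffer.");
                Buffer.BlockCopy(buffer, start, buffer, 0, filled);
            }

            // Last row without a trailing newline.
            if (filled > 0)
                EmitRow(new ReadOnlySpan<byte>(buffer, 0, filled), row++, delimiter, onField);
            return row;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    private static void EmitRow(ReadOnlySpan<byte> line, long row, byte delimiter, FieldHandler onField)
    {
        // Drop the '\r' of a Windows line ending.
        if (line.Length > 0 && line[line.Length - 1] == (byte)'\r')
            line = line.Slice(0, line.Length - 1);

        int column = 0;
        while (true)
        {
            int cut = line.IndexOf(delimiter);
            if (cut < 0) { onField(line, row, column); return; }
            onField(line.Slice(0, cut), row, column++);
            line = line.Slice(cut + 1);
        }
    }
}
```

A call such as CsvScanner.Scan(stream, (field, row, column) => { ... }) would visit every field in order. To feed SqlBulkCopy, a custom IDataReader would still have to turn each slice into a typed value; System.Buffers.Text.Utf8Parser.TryParse can parse numbers from UTF-8 bytes without creating a string first, while string columns need one allocation per value in any case.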
