Channel: VBForums

[RESOLVED] StreamReader readline() split only at newlines that are not encapsulated in quotes

I have the following function:
Code:

        /// <summary>
        /// Pulls info from CSV file and stores each entry as list of string arrays
        /// </summary>
        /// <param name="path">Path to the CSV file to read</param>
        /// <returns>One string array of fields per line read from the file</returns>
        public static List<string[]> parseCSV(string path)
        {
            List<string[]> parsedData = new List<string[]>();

            try
            {
                using (StreamReader readFile = new StreamReader(path))
                {
                    string line;
                    string[] row;
                    string pattern = ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";  // Matches commas that are not enclosed in quotation marks
                    Regex r = new Regex(pattern);
                    while ((line = readFile.ReadLine()) != null)
                    {
                        row = r.Split(line);
                        parsedData.Add(row);
                    }
                }
            }
            catch (Exception e)
            {
                MessageBox.Show(e.Message);
                CommitSuicide();
            }

            return parsedData;
        }

This worked well in the past, but the company that sends us the CSV has added a new field that sometimes contains newlines (\r\n). The function no longer works for us because (line = readFile.ReadLine()) splits at every newline, including those inside a quoted field.

What is the best way to modify the existing function so that it only splits at newlines that aren't enclosed in double quotes? I suppose I could create a StreamReader extension called ReadEntry that basically recreates what ReadLine already does, but that sounds rather tedious and beyond my skill level, to be honest.
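
For what it's worth, the ReadEntry idea may not need to recreate ReadLine at all. Below is a minimal sketch, not a definitive implementation. It keeps calling ReadLine and joins the physical lines until the number of double quotes in the accumulated text is even. The class name CsvRecordReader and the method ReadEntry are hypothetical names, not part of StreamReader. The sketch assumes that fields containing newlines are always wrapped in double quotes, that embedded quotes are escaped by doubling them, and that the original line break was \r\n.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvRecordReader
{
    // Hypothetical helper (not a framework method): reads one logical CSV
    // record, joining physical lines until the double quotes balance.
    public static string ReadEntry(this TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null)
        {
            return null; // end of input
        }

        string entry = line;

        // An odd number of quotes means a quoted field is still open, so the
        // newline that ReadLine just consumed belonged to that field.
        // Doubled quotes ("") add two to the count and keep the parity intact.
        while (entry.Count(c => c == '"') % 2 != 0)
        {
            line = reader.ReadLine();
            if (line == null)
            {
                break; // unterminated quote at end of file: return what we have
            }

            // Assumption: the original break was \r\n, as described above.
            entry += "\r\n" + line;
        }

        return entry;
    }
}
```

With this in place, the only change in parseCSV would be to call readFile.ReadEntry() instead of readFile.ReadLine() in the while condition. The existing pattern should still split such a record correctly, because the negated class [^\"] also matches newline characters.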
