Below is the regular expression I have come up with for the field lengths, though I think I can tighten it up later.
CODE
Dim matchpattern as String = "^[^,]{0,9},[^,]{0,10},[^,]{0,15},[^,]{0,19},[^,]{0,10},(?:[^,]{0,36},){5}[^,]{0,16},(?:[^,]{0,36},){2}[^,]{0,16}(?:\r\n|$)"
What I need help with is the general process of the application, i.e.:
Check all files in c:\test.
If a CSV file matches the regular expression for the number of fields (14) and no field exceeds its limit, save the file to c:\test\good.
If a CSV file fails to match the expression, save it to c:\test\bad.
Remove the CSV file from c:\test.
Your example (C#) is very close to what I need. Could you help me with the overall code?
I am glad you liked my blog article on regular expressions. I am always flattered when I can help someone out through the blog.
As for your problem, you are going to love me I am sure. I have taken my example from the blog and modified it to meet your criteria. Some of the changes I have made are...
1.) Made my main function loop through files of the given directory looking for csv files (c:\Test)
2.) Changed my class from using Run to now using TestFile and it returns a boolean (true/false) as to whether or not it passed our test.
3.) Used the return value from our new function to then copy the files to the given folder based on the result. If it returns true, it is a good file. If it returns false it is a bad file.
While I have brought you 99% of the way there, I always love to leave 1% as homework for the user. You will notice I did not change the pattern or the field count. I will leave that up to you. I have also made the copy function to copy the files, but did not remove them from the directory. I am sure you can figure that part out yourself too. You can keep my copy in place or choose to use a function like "move()". Up to you.
So here is the code fully documented as usual...
csharp
using System;
using System.IO;

// Notice the namespace for using regular expressions. .NET has an entire namespace dedicated to the topic.
using System.Text.RegularExpressions;

namespace experimentalconsole
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of our class and run it
            checkCSV csv = new checkCSV();

            // Get the files in the given directory matching the csv extension
            String[] Filenames = Directory.GetFiles("c:\\Test","*.csv");

            // Loop through all the files returned from that directory
            foreach (String filename in Filenames)
            {
                // Check to make sure each exist
                if (File.Exists(filename))
                {
                    // Run it through our test
                    bool valid = csv.TestFile(filename);

                    // Pull off the filename from the path
                    String basename = Path.GetFileName(filename);

                    // If the file passed testing, write to screen it was valid
                    // Then copy it to the good folder under c:\\Test
                    if (valid)
                    {
                        Console.WriteLine(filename + " is valid");

                        File.Copy(filename, "c:\\Test\\good\\" + basename);
                    }
                    else
                    {
                        // Failed test, move it to the bad folder under c:\\Test
                        Console.WriteLine(filename + " is invalid");
                        File.Copy(filename, "c:\\Test\\bad\\" + basename);
                    }
                }
            }
        }

    }

    class checkCSV
    {
        // Lets setup our pattern first
        private static string csvRegexPattern = @"([^,""]+|""([^""]|"""")*""|,,)";

        // Lets now setup a Regex object using the pattern
        // This creates a Regex object using the pattern passed as a parameter to the constructor
        private static Regex _Regex = new Regex(csvRegexPattern);

        // This method replaced Run in the previous example and now returns a true or false
        // to determine if passed our tests
        public bool TestFile(string AFile)
        {
            String lineRead;
            int lineNumber = 0;

            try
            {
                StreamReader sr = new StreamReader(AFile);

                // Loop through the file, reading each line
                while (null != (lineRead = sr.ReadLine()))
                {
                    // Increment the line number
                    lineNumber++;

                    if (_Regex.Matches(lineRead).Count != 5)
                    {
                        // Return false because it didn't match our field count of 5
                        // Change this to 14 and use your pattern instead of mine.
                        return false;
                    }
                }
            }
            catch (Exception e)
            {
                // If there was an error, lets see what it was, return false.
                Console.WriteLine("File could not be found: {0}", e.Message.ToString());
                return false;
            }

            // Must have passed our test, return true.
            return true;
        }
    }
}
Read through the in-code comments to see what has been changed and what you need to do where. After you make your few changes, you should have something working within about, ohhhh, 15 minutes.
Enjoy and thanks for reading my blog!
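For the "remove from c:\Test" part that the post leaves as homework, one possible approach is to move each file instead of copying it. The sketch below is only an illustration: `CsvSorter` and `MoveInto` are made-up names, and deleting an existing destination file first is an assumption about the desired behaviour (`File.Move` throws if the destination already exists).

```csharp
using System;
using System.IO;

static class CsvSorter
{
    // Hypothetical helper, not part of the original post.
    // Moves sourcePath into targetDirectory and returns the new full path.
    public static string MoveInto(string sourcePath, string targetDirectory)
    {
        // Creates the folder if needed; does nothing if it already exists.
        Directory.CreateDirectory(targetDirectory);

        string destination = Path.Combine(targetDirectory, Path.GetFileName(sourcePath));

        // Assumption: a newer file should replace an older one of the same name.
        if (File.Exists(destination))
        {
            File.Delete(destination);
        }

        // Move removes the file from its original location in one step.
        File.Move(sourcePath, destination);
        return destination;
    }
}
```

In the loop above, each `File.Copy(...)` call could then become something like `CsvSorter.MoveInto(filename, "c:\\Test\\good")` or `CsvSorter.MoveInto(filename, "c:\\Test\\bad")`.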
"At DIC we be regular expression blogging code ninjas!"
Thank you so much, Marty, for your help. I will review the code and test it over the next few days, and then let you know how I get on. I take it that this is the best place to communicate rather than email. I will be in touch... Love is such a strong word, but I really do appreciate the help!!! Ray (Ireland)
Hiya Marty, I have been testing away with the code.. I have a regular expression that passes my csv when I test it using a regular expression designer.. The full expression is at the very end of this reply
But when I run your code I constantly get "file is invalid". So I stripped my file back to just one field and tested: passed. Then I tested using 2 fields: failed.
ie:
CSV file just contains 0067
private static string csvRegexPattern = @"(^[^,]{0,9})";
if (_Regex.Matches(lineRead).Count != 1)
NoError ------> Valid

CSV file just contains 0067,medium
private static string csvRegexPattern = @"(^[^,]{0,9},[^,]{0,10})";
if (_Regex.Matches(lineRead).Count != 2)
Error ---> Invalid ??

Also, if the CSV file just contains 0067123456789
private static string csvRegexPattern = @"(^[^,]{0,9})";
if (_Regex.Matches(lineRead).Count != 1)
NoError ------> Valid
But this should fail because it is longer than 9 characters...
Any more help, please? Ray..
P.S The full expression that matches in my regex designer is @"(^[^,]{0,9},[^,]{0,10},[^,]{0,15},[^,]{0,19},[^,]{0,10},(?:[^,]{0,36},){5}[^,]{0,16},(?:[^,]{0,36},){2}[^,\r\n]{0,16})";
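A likely explanation for both results, sketched here under a few assumptions: `Regex.Matches` returns one entry per match of the whole pattern, not one per field. The original pattern matched a single field at a time, so its count equalled the number of fields. A pattern that starts with `^` can match at most once per line, so the count is 1 however many fields it describes. The 13-character value passes because the pattern has no `$`: it matches the first 9 characters and ignores the rest. The names `MatchCountDemo` and `CountMatches` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchCountDemo
{
    // Hypothetical helper: how many times does pattern match inside line?
    public static int CountMatches(string pattern, string line)
    {
        return Regex.Matches(line, pattern).Count;
    }

    // The per-field pattern from the original example: one match per field.
    public const string PerField = @"([^,""]+|""([^""]|"""")*""|,,)";

    // A whole-line pattern anchored at the start only (no $ at the end).
    public const string TwoFieldsNoEnd = @"(^[^,]{0,9},[^,]{0,10})";

    // One field of at most 9 characters, anchored at both ends.
    public const string OneFieldAnchored = @"^[^,]{0,9}$";
}
```

With these definitions, `CountMatches(PerField, "a,b,c,d,e")` gives 5, `CountMatches(TwoFieldsNoEnd, "0067,medium")` gives 1 (which is why comparing against 2 reports the file as invalid), `CountMatches(@"(^[^,]{0,9})", "0067123456789")` gives 1, and `CountMatches(OneFieldAnchored, "0067123456789")` gives 0.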
Hiya, If I leave the count set to 1 then everything works fine
CODE
if (_Regex.Matches(lineRead).Count != 1)
{
    // Return false because it didn't match our field count of 5
    // Change this to 14 and use your pattern instead of mine.
Changing the count to 14, as the comment suggests, doesn't really work. If I leave it at 1, the checker reports valid and invalid files correctly, even for files that contain multiple lines. There was no real need for this check to count the number of fields: if a line doesn't match the expression exactly, it fails. So there was no need to count the fields..??
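One way to act on this observation, as a rough sketch: since the whole-line pattern already encodes both the field count and the length limits, each line can be tested with `Regex.IsMatch` and no counting at all. This assumes the 14-field pattern from the P.S. above with a trailing `$` added (that anchor is my addition, so that extra characters or extra fields at the end of a line fail). `CsvLineValidator`, `IsValidLine`, and `IsValidFile` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class CsvLineValidator
{
    // The 14-field pattern from the thread, plus an assumed trailing $.
    private static readonly Regex LinePattern = new Regex(
        @"^[^,]{0,9},[^,]{0,10},[^,]{0,15},[^,]{0,19},[^,]{0,10}," +
        @"(?:[^,]{0,36},){5}[^,]{0,16},(?:[^,]{0,36},){2}[^,\r\n]{0,16}$");

    // True when the line has exactly 14 fields, each within its length limit.
    public static bool IsValidLine(string line)
    {
        return LinePattern.IsMatch(line);
    }

    // True when every line of the file passes IsValidLine.
    public static bool IsValidFile(string path)
    {
        // The using block closes the reader even on an early return.
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            // ReadLine strips the line terminator, so $ sees the end of the data.
            while ((line = reader.ReadLine()) != null)
            {
                if (!IsValidLine(line))
                {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Two behaviours worth knowing about in this sketch: an empty file counts as valid because no line fails, and a blank line counts as invalid because the pattern requires 13 commas. Whether either is desirable depends on the files being checked.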