C# - Efficient way to tokenize conditionally


Given a user input string:

"mainframes/pl/ sql; software testing/pl/sql/project management/"

What is an efficient way to tokenize the string such that a '/' is retained if it is part of "pl/ sql", but not otherwise, giving these tokens:

"mainframes", "pl/ sql", "software testing", "pl/sql", "project management"

This is because users may accidentally enter the '/' character as the separator.

If the order of the tokens isn't important, then this might work:

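    // Needs: System, System.Collections.Generic, System.Text.RegularExpressions
    public IEnumerable<string> Tokenise()
    {
        var input = "mainframes/pl/ sql; software testing/pl/sql/project management/";
        var results = new List<string>();

        // Collect every "pl/sql" variant first, ignoring case and whitespace around the '/'.
        foreach (Match match in Regex.Matches(input, @"pl\s*/\s*sql", RegexOptions.IgnoreCase))
        {
            results.Add(match.Value);
        }

        // Strip those matches so their '/' is no longer seen as a separator.
        input = Regex.Replace(input, @"pl\s*/\s*sql", string.Empty, RegexOptions.IgnoreCase);

        // Split what remains on '/' and drop the empty entries.
        results.AddRange(input.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries));

        return results;
    }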

This starts by searching for the pl/sql tokens (accounting for differences in whitespace and capitalisation), strips them out of the input string, and then performs a simple split on the remaining '/' characters. The downside is that the order of the tokens differs from the input string.
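If the order does matter, one possible alternative is to split in a single pass and use lookarounds to skip the '/' that sits inside "pl/sql". The following is only a sketch under some assumptions: the class name `OrderedTokenizer` and the method `Tokenize` are hypothetical, and ';' is treated as a separator purely because the expected tokens in the question omit it (the snippet above would return "; software testing" instead). Tokens are also trimmed, which the original answer does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class OrderedTokenizer
{
    // A separator is either:
    //   ;               a semicolon (assumption inferred from the expected tokens)
    //   (?<!\bpl\s*)/   a '/' that is NOT preceded by "pl" plus optional whitespace
    //   /(?!\s*sql\b)   a '/' that is NOT followed by optional whitespace plus "sql"
    // So a '/' survives only when it has "pl" before it AND "sql" after it.
    private static readonly Regex Separator = new Regex(
        @";|(?<!\bpl\s*)/|/(?!\s*sql\b)",
        RegexOptions.IgnoreCase);

    public static IEnumerable<string> Tokenize(string input)
    {
        return Separator.Split(input)
            .Select(token => token.Trim())        // drop whitespace around each token
            .Where(token => token.Length > 0);    // drop empty pieces, e.g. after a trailing '/'
    }
}
```

Calling `OrderedTokenizer.Tokenize` on the input string from the question should yield "mainframes", "pl/ sql", "software testing", "pl/sql", "project management", in that order. The pattern relies on .NET's support for variable-length lookbehind, so it may not port directly to other regex engines.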

