MS Access Forum / Queries / June 2007
Isolate Patterns
|
|
Thread rating:  |
EllenM - 26 Jun 2007 13:43 GMT Hello, I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930. I can find records with this pattern with the following query:
SELECT InventoryOct2006.Originator FROM InventoryOct2006 WHERE (((InventoryOct2006.Originator) Like "*#?####*"));
I'd like to write a query to give me only the data that fits the pattern, for instance 2A3672, and omit all the other stuff in that cell. Placing "#?####" in the criteria doesn't seem to do it.
Thanks in advance for your help.
Ellen
Michel Walsh - 26 Jun 2007 16:24 GMT Have you tried:
LIKE "*[0-9]?[0-9][0-9][0-9][0-9]*"
Hoping it may help, Vanderghast, Access MVP.
> Hello, > I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930. I can [quoted text clipped - 12 lines] > > Ellen EllenM - 26 Jun 2007 16:49 GMT Thanks, Michel. That finds the whole string in the cell. I want to be able to parse out patterns such as 3B1234 or 4C2345.
> Have you tried: > [quoted text clipped - 19 lines] > > > > Ellen John Spencer - 26 Jun 2007 17:10 GMT So you are saying that you want to see only the data that matches within the field?
Example Field Value: "This string is 2A3672 and xxxxx" The returned value should only show "2A3672" and discard all the verbiage around it.
IF that is the case, I would think that you would need some fairly complex vba code to handle and/or the use of John Nurick's reg expression code.
http://www.mvps.org/access/modules/mdl0063.htm
Or check out http://www.j.nurick.dial.pipex.com/Code/vbRegex/rgxExtract.htm
 Signature John Spencer Access MVP 2002-2005, 2007 Center for Health Program Development and Management University of Maryland Baltimore County .
> Hello, > I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930. I can [quoted text clipped - 12 lines] > > Ellen EllenM - 26 Jun 2007 19:04 GMT Hi John, My coworker and I have poured over the second link. We've had success with doing immediate pane exercise. The code that worked is:
? rgxExtract("blah blah FAP 3b3333 blah blah", "\w+\s\d\w\d{4}", 0)
The only glitch is that my string contains a carriage return, so that it looks like:'\
? rgxExtract("Abbott Laboratories; FAP 2A3672", "\n*\w+\s\d\w\d{4}", 0)
The vb editor interpreted that script as two lines and therefore it didn't work.
Is there a way for me to strip out the carriage returns from my string prior to passing it into the function?
How do I apply this function to my table outside the vb editor? BTW, the pattern only exists once in a given string.
Thanks so much for your help.
Ellen
> So you are saying that you want to see only the data that matches within the > field? [quoted text clipped - 27 lines] > > > > Ellen John Spencer - 26 Jun 2007 20:45 GMT You could use the VBA replace function (Access 2000 or later)
rgxExtract(Replace([The field],Chr(13) & Chr(10)," ") ,"\w+\s\d\w\d{4}",0)
In a query, you could do something like
Field: Stripped: rgxExtract(Replace([TheField],Chr(13) & Chr(10)," ") ,"\w+\s\d\w\d{4}",0) Criteria: Not null or <> ""
I would probably use an additional column and apply criteria to it on the theory that SQL engine is going to weed things down for me before I do the rgxExtract. You might try it both ways and see if there is any benefit to using the like criteria versus using criteria on just the stripped result.
Field: TheField Criteria: Like "*#[a-z]####*"
Of course, one benefit of using criteria on only the "Stripped" result is that you don't have to figure out a like criteria that matches up with what you want to extract.
 Signature John Spencer Access MVP 2002-2005, 2007 Center for Health Program Development and Management University of Maryland Baltimore County .
> Hi John, > My coworker and I have poured over the second link. We've had success [quoted text clipped - 57 lines] >> > >> > Ellen EllenM - 28 Jun 2007 18:38 GMT Hi John, Could you write the script in SQL?
Thanks so much for everything. You've been most helpful.
Ellen
> You could use the VBA replace function (Access 2000 or later) > [quoted text clipped - 17 lines] > that you don't have to figure out a like criteria that matches up with what > you want to extract. John Spencer - 28 Jun 2007 20:07 GMT I'm sorry but I am not sure I understand your request.
Do you want someone to write the SQL for the query?
Did you copy John Nurick's code and put it into module?
Perhaps something like the following will work.
SELECT InventoryOct2006.Originator , rgxExtract(Replace([Originator],Chr(13) & Chr(10)," "), "\b[A-Z]{3,4}\s\d[A-Z]\d{4}",0) as FoundValue FROM InventoryOct2006 WHERE InventoryOct2006.Originator Like "*#?####*"
I am not that familar with using regular expressions and I have to play around quite a bit to get the correct results. You can try the above and see if it works. '==================================================== John Spencer Access MVP 2002-2005, 2007 Center for Health Program Development and Management University of Maryland Baltimore County '====================================================
> Hi John, > Could you write the script in SQL? [quoted text clipped - 24 lines] >> that you don't have to figure out a like criteria that matches up with what >> you want to extract. EllenM - 29 Jun 2007 12:16 GMT I'm speechless... I was kind of expecting VBA something or other that I couldn't understand. Your script was exactly was I've been looking for... a simple query that even I could do.
Many, many thanks!!!
Ellen
> I'm sorry but I am not sure I understand your request. > [quoted text clipped - 48 lines] > >> that you don't have to figure out a like criteria that matches up with what > >> you want to extract. RoyVidar - 26 Jun 2007 20:47 GMT "EllenM" <EllenM@discussions.microsoft.com> wrote in message <3166CD52-1FCA-4612-86BF-642CD83CB498@microsoft.com>:
> Hi John, > My coworker and I have poured over the second link. We've had [quoted text clipped - 54 lines] >>> >>> Ellen I don't understand your pattern. You say you want only the "3b3333", but the pattern "\w+\s\d\w\d{4}" should give you a bit more than that.
n* - zero or more new line w+ - one or more occurences of [a-zA-Z_0-9] s - whitespace
which occurs before "your pattern".
d - is numeric w - again, is [a-zA-Z_0-9] - are you sure about that, or should you rather use for instance [a-zA-Z]?
I would guess perhaps "\d[a-zA-Z]\d{4}", perhaps, to ensure it doesn't exist within other words, you could test for word boundaries, "\b\d[a-zA-Z]\d{4}\b"
Here's an attempt at a function. I'd suggest a module level variable for the regexp object, placed in the declaration section
Private mre As Object
' then the function
Public Function rvsGetText(ByVal TheString As Variant) As String
Dim mc As Object Const MyPattern As String = "\d[a-zA-Z]\d{4}"
If Not IsNull(TheString) Then If mre Is Nothing Then Set mre = CreateObject("vbscript.regexp") End If With mre .MultiLine = True .Pattern = MyPattern Set mc = .Execute(TheString) If mc.Count > 0 Then rvsGetText = mc(0) End If Set mc = Nothing End With
End If
End Function
In the query, call it like this
NewField: rvsGetText([YourField])
or in SQL
SELECT rvsGetText([YourField]) as NewField, ...
There's no errorhandling, though, and it will return ZLS if no match.
 Signature Roy-Vidar
EllenM - 27 Jun 2007 00:30 GMT Thanks for your response, Roy. Actually you're right. We're looking for a 3 or 4 string of capital letters followed by a space, then a digit,a capital letter followed by 4 digits. For examle, FAP 3B3333. I started out asking for just the last 6 characters as that happens to be unique.
And thanks, to you, John. I'll read both replies in detail tomorrow morning. My coworker (who is quite versed in VB) will be able to help me then.
Bye! Ellen
> "EllenM" <EllenM@discussions.microsoft.com> wrote in message > <3166CD52-1FCA-4612-86BF-642CD83CB498@microsoft.com>: [quoted text clipped - 113 lines] > > There's no errorhandling, though, and it will return ZLS if no match. RoyVidar - 27 Jun 2007 07:56 GMT "EllenM" <EllenM@discussions.microsoft.com> wrote in message <A1097063-AF86-4784-AFA7-D89C1D7BE389@microsoft.com>:
> Thanks for your response, Roy. Actually you're right. We're looking > for a 3 or 4 string of capital letters followed by a space, then a [quoted text clipped - 129 lines] >> There's no errorhandling, though, and it will return ZLS if no >> match. Try this pattern
"[A-Z]{3,4}\s\d[A-Z]\d{4}"
This might give wrong results when/if the initial string of three or four uppercase letters consists of more than four letters, so also testing for word boundary might be necessary
"\b[A-Z]{3,4}\s\d[A-Z]\d{4}"
This should match your latest requirements.
BTW - if you want to pass a new line from the immediate pane for testing, try something like this
?MyFunction("This is the first part " & vbcrlf & " the next part")
 Signature Roy-Vidar
EllenM - 27 Jun 2007 16:22 GMT Thanks so much for your help, particularly the link given by John. My coworker wrote a script in Excel that does what I want. It brings the data from column F into empty column G.
Sub Regex() Dim Submission_title As Variant Range("F4").Select
For Counter = 1 To 170 Submission_title = rgxExtract(ActiveCell.Value, "\w+\s\d\w\d{4}", 0) If IsNull(Submission_title) = True Then ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate Else 'MsgBox (Submission_title) ActiveCell.Offset(rowOffset:=0, columnOffset:=1).Activate ActiveCell.Value = Submission_title ActiveCell.Offset(rowOffset:=1, columnOffset:=-1).Activate End If Next Counter End Sub
I'm still going to read the posts to write a query to do the same thing. Thank you both John and Roy.
Ellen
|
|
|