MS Access Forum / Database Design / January 2005
HOW DO I FIND NUMBER OF OCCURENCES FOR DIFFERENT LETTERS IN A FIEL
|
|
Thread rating:  |
jagadeeshtiru@hotmail.com - 29 Jan 2005 06:55 GMT WHILE I AM CREATING A DATABASE IN ACCESS, I NEED TO COUNT THE NUMBER OF REPEATED LETTERS FOR EACH ROW. THE LETTER COUNT SHOULD BE PRINT IN SEPERATE FIELDS FOR EACH DIFFERENT LETTER.
I HAVE 12500 ROWS IN FIELD, CONTAINING A SEQUENCE OF LETTERS OF "A", "T","G","C" WITH LENGTH <=600. NOW I NEED TO COUNT THE NUMBER OF A's, T's, G's AND C's PRESENT IN EACH ROW.
THE COUNT MUST BE INTO A NEW FIELD.
CAN YOU SOLVE MY PROBLEM?
PLEASE...
THANKING YOU
John Vinson - 29 Jan 2005 07:10 GMT >WHILE I AM CREATING A DATABASE IN ACCESS, >I NEED TO COUNT THE NUMBER OF REPEATED LETTERS FOR EACH ROW. [quoted text clipped - 11 lines] > >THANKING YOU First off... please turn off your CAPS LOCK key. It's hard to read, impolite, and looks like you are SHOUTING AT US.
Secondly - this must obviously be a Memo field since text fields are limited to 255 bytes, right? DNA sequences it seems...
You will need some VBA code to do this. Here's some untested code, but it should work.
Public Function CountLetter(strSeq as String, strBase as String) _ As Integer Dim iPos As Integer CountLetter = 0 For iPos = 1 to Len(strSeq) If Mid(strSeq, iPos, 1) = strBase Then CountLetter = CountLetter + 1 End If Next iPos End Sub
In a Query in four vacant Field cells, assuming the memo field is named Seq, type
CountA: CountLetter([Seq], "A") CountT: CountLetter([Seq], "T")
and so on.
Don't expect this to be blazing fast on a large table!
John W. Vinson[MVP]
John Nurick - 29 Jan 2005 16:50 GMT >Public Function CountLetter(strSeq as String, strBase as String) _ > As Integer [quoted text clipped - 16 lines] > >Don't expect this to be blazing fast on a large table! Hi John,
Out of curiosity I wrote a similar function using Static variables to avoid iterating the string every time the function is called. I timed update queries using the two functions, running them against a table containing 100,000 random sequences, i.e. 400,000 function calls. The sequences were of random lengths between 300 and 599 characters.
Results of three runs of each query suggest that the Static approach is usefully faster. These are hh:mm:ss: Vinson: 00:05:22 Nurick: 00:02:35 Vinson: 00:05:20 Nurick: 00:02:34 Vinson: 00:05:19 Nurick: 00:02:37
Here's the function I used:
Public Function AllAtOnce(Sequence As String, Base As String) As Long Static S As String Static ACount As Long Static CCount As Long Static GCount As Long Static TCount As Long Dim j As Long If Sequence <> S Then 'perform a count S = Sequence ACount = 0 CCount = 0 GCount = 0 TCount = 0 For j = 1 To Len(S) Select Case Mid(S, j, 1) Case "A": ACount = ACount + 1 Case "C": CCount = CCount + 1 Case "G": GCount = GCount + 1 Case "T": TCount = TCount + 1 End Select Next End If Select Case Base Case "A": AllAtOnce = ACount Case "C": AllAtOnce = CCount Case "G": AllAtOnce = GCount Case "T": AllAtOnce = TCount End Select End Function
-- John Nurick [Microsoft Access MVP]
Please respond in the newgroup and not by email.
Brendan Reynolds - 29 Jan 2005 18:56 GMT See the thread "Count occurances of specific character in string" in the microsoft.public.access.queries newsgroup
 Signature Brendan Reynolds (MVP)
> WHILE I AM CREATING A DATABASE IN ACCESS, > I NEED TO COUNT THE NUMBER OF REPEATED LETTERS FOR EACH ROW. [quoted text clipped - 13 lines] > > THANKING YOU John Nurick - 29 Jan 2005 22:12 GMT >See the thread "Count occurances of specific character in string" in the >microsoft.public.access.queries newsgroup Neat. This query
UPDATE tblMain SET ACount = Len([Sequence])-Len(Replace([Sequence],"A","")), CCount = Len([Sequence])-Len(Replace([Sequence],"C","")), GCount = Len([Sequence])-Len(Replace([Sequence],"G","")), TCount = Len([Sequence])-Len(Replace([Sequence],"T","")) ;
updated the 100,000 records in about 1'34", a minute faster than my "static" function.
-- John Nurick [Microsoft Access MVP]
Please respond in the newgroup and not by email.
John Vinson - 30 Jan 2005 00:28 GMT >UPDATE tblMain >SET [quoted text clipped - 6 lines] >updated the 100,000 records in about 1'34", a minute faster than my >"static" function. wow. Never would have thought of That one. "Me 'at's off to the Dook!"
John W. Vinson[MVP]
John Nurick - 30 Jan 2005 08:16 GMT >>UPDATE tblMain >>SET [quoted text clipped - 8 lines] > >wow. Never would have thought of That one. "Me 'at's off to the Dook!" Hats off, definitely, but I'm kicking myself for not having thought of it. In Perl I'd probably have done $Acount = s/A/A/g; $Ccount = s/C/C/g; ... which replaces each character with itself and counts the number of replacements, and from that it's only a small step to TK's solution.
-- John Nurick [Microsoft Access MVP]
Please respond in the newgroup and not by email.
Graham R Seach - 30 Jan 2005 12:46 GMT Just to add a dimension to this thread, this is what I use (in VBA): CountOccurrences = (Len(sSource) - Len(Replace(sSource, sFind, ""))) / Len(sFind)
Regards, Graham R Seach Microsoft Access MVP Sydney, Australia ---------------------------
>>See the thread "Count occurances of specific character in string" in the >>microsoft.public.access.queries newsgroup [quoted text clipped - 16 lines] > > Please respond in the newgroup and not by email. Douglas J. Steele - 30 Jan 2005 15:49 GMT Documented in my April, 2004 Access Answers column in Smart Access http://members.rogers.com/douglas.j.steele/SmartAccess.html although I use \ Len(sFind) instead of / Len(sFind).
As I state in the article, I got it from a post Mike Sutton (a VB MVP) made some time back.
 Signature Doug Steele, Microsoft Access MVP http://I.Am/DougSteele (no e-mails, please!)
> Just to add a dimension to this thread, this is what I use (in VBA): > CountOccurrences = (Len(sSource) - Len(Replace(sSource, sFind, ""))) / [quoted text clipped - 26 lines] >> >> Please respond in the newgroup and not by email. Graham R Seach - 30 Jan 2005 22:33 GMT I don't read Smart Access. Perhaps I should. It seems it would save me the trouble of working out stuff for myself.
Regards, Graham R Seach Microsoft Access MVP Sydney, Australia
Microsoft Access 2003 VBA Programmer's Reference http://www.wiley.com/WileyCDA/WileyTitle/productCd-0764559036.html
> Documented in my April, 2004 Access Answers column in Smart Access > http://members.rogers.com/douglas.j.steele/SmartAccess.html although I use [quoted text clipped - 33 lines] >>> >>> Please respond in the newgroup and not by email.
|
|
|