Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion GroupsFormsForms ProgrammingQueriesModules / DAO / VBAReports / PrintingMacrosDatabase DesignSecurityConversionImporting / LinkingSQL Server / ADPMultiuser / NetworkingReplicationSetup / ConfigurationDeveloper ToolkitsActiveX ControlsNew UsersGeneral 1General 2
Access DirectoryToolsTutorialsUser Groups
Related Topics
SQL ServerOther DB ProductsMS OfficeMore Topics ...

MS Access Forum / Queries / November 2005

Tip: Looking for answers? Try searching our database.

Crosstab Pivot column problem

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Jt3mitch1 - 18 Nov 2005 18:47 GMT
I've been racking my brains to figure out the best way to approach my problem.

I have test results I need to organize:

ID          date          labnumber(unique)     result
1           10/01/01        1                          15.0
1           10/01/01        2                           7.0
1           11/01/01        3                           0.0
1           11/01/01        4                           0.0
1           11/01/01        5                           7.0
2           12/01/02        6                          15.0
2           12/01/02        7                           1.0
2           01/01/03         8                          1.0
2            01/01/03        9                           2.5

I'm trying to use a crosstab query to convert data to:

ID          date                 rank1                  rank2                
rank3
1          10/01/01              7.0                   15.0
1          11/01/01              0.0                     0.0                
    7.0
2          12/01/02             1.0                     15.0
2           01/01/03             1.0                    2.5

I'll have up to 40 results for each ID.  Too many different results in the
database to use results as column headings. I need to figure out how to rank
the results and have this rank as the column heading.

I'm bad at VBA, so after checking through newsgroups, I saw a post by Steve
Dassin that looked promising. This is a general idea of what I'm trying to
adjust to meet my needs:

TRANSFORM Max(result) AS data
SELECT ID, date,
DCount("*","query1","(" & "[ID]='"&ID&"'" &")" & "AND" &"(" &
"[result]<"&result &
"OR" & "(" & "[result]="&result & "AND" & "[labnumber]<"&labnumber & ")" &
")" ) + 1 AS alias,
Count(data) AS cnt,
FROM query1
GROUP BY ID, date,
PIVOT
DCount("*","query1","(" & "[ID]='"&ID&"'" &")" & "AND" &"(" &
"[result]<"&result &
"OR" & "(" & "[result]="&result & "AND" & "[labnumber]<"&labnumber & ")" &
")" ) + 1;

Thanks for any assistance,

JT
Tom Ellison - 19 Nov 2005 00:03 GMT
Dear JT:

Since you only want the top 2 unique values of [result] for each ID, you can
eliminate duplications:

SELECT DISTINCT ID, [date], result
 FROM query1

Make this query2.

Building up from this, you only care about the top 2 for each ID:

SELECT ID, [date], result
 FROM query2 Q
 WHERE (SELECT COUNT(*) FROM query2 QA
   WHERE QA.ID = Q.ID AND QA.result > Q.result) < 2

I'm expecting this to give you:

ID        date        result
1          10/01/01  15.0
1          10/01/01    7.0
2          12/01/02  15.0
2          01/01/03    2.5

Make this query3.

Next, we can add a rank column and then crosstab this, or we can just
subquery it (replacing the crosstab).

However, before continuing, please consider what you'd want from this data:

ID          date          labnumber(unique)     result
1           10/01/01        1                          15.0
1           10/01/01        2                           7.0
1           11/01/01        3                           0.0
1           11/01/01        4                           0.0
1           11/02/01        5                           7.0

All I've done is change the date of the last row.  Now, when you report the
result = 7 in the Rank2 column, what date do you want to show for it?

Finally, ignoring the above problem for the moment, you can produce your
final result:

SELECT DISTINCT ID, date,
   (SELECT MAX(result)
     FROM query3 Q3A
     WHERE Q3A.ID = Q3.ID
     GROUP BY ID) Rank1,
   (SELECT MIN(result)
     FROM query3 Q3A
     WHERE Q3A.ID = Q3.ID
     GROUP BY ID) Rank2
 FROM query3 Q3

Please let me know if this helps, and whether I can be of any further
assistance.

Tom Ellison

> I've been racking my brains to figure out the best way to approach my
> problem.
[quoted text clipped - 50 lines]
>
> JT
Jt3mitch1 - 19 Nov 2005 06:40 GMT
Tom,
Thanks for taking a look at my problem.

I'm basically trying to convert rows in a query (specifically results field)
to columns (based on id by date as rows).  My attempt at using crosstabs
includes a  lack of access knowledge.

I'll want all the results, not just the min and max to be listed in the
columns (1stresult, 2ndresult, 3rdresult....40thresult).  I thought I could
use rank as the column headings, but now realize that having more than one
result with the same value breaks the rank idea.

Each row in my query1 is unique by labnumber

Id    Date    Labnum    Result
1    10/01/01    1    15.0
1    10/01/01    2    7.0
1    11/01/01    3    0.0
1    11/01/01    4    0.0
1    11/01/01    5    7.0
2    12/01/02    6    15.0
2    12/01/02    7    1.0
2    01/01/03    8    1.0
2    01/01/03    9    2.5

For Query2, I originally thought I could rank the results for each id and
date from lowest to hightest (up to 40 results for an ID on any given date):

ID    date         labnum    result    Rank
1    10/1/2001    2    7    1
1    10/1/2001    1    15    2
1    11/1/2001    4    0    1
1    11/1/2001    3    0    2
1    11/1/2001    5    7    3     <3 results for ID#1 on 11/1/2001
2    12/1/2002    7    1    1
2    12/1/2002    6    15    2
2    1/1/2003    8    1    1
2    1/1/2003    9    2    2

Query3 would convert the rank values to column headings and I'd list the
count of results per ID and date:

ID    date           Count    Rank1 Rank2  Rank3    up to . . Rank 40
1    10/1/2001    2    7    15       
1    11/1/2001    3    0    0          7   
2    12/1/2002    2    1    15   
2    1/1/2003    2    1    2   

My ultimate goal was to get the count of results for each id on a given
date.  Then  I would identify which result was the 90th percentile based on
the count. 90th percentile="rank" & roundup(0.9*count)

I'm sure this is the long way around, but I couldn't figure out how to get
90th percentile (for each id by date) straight from query1.  

I don't know if I've made it more clear or more complicated.

Thanks again,

JT

>"Tom Ellison" wrote:
> Dear JT:
[quoted text clipped - 57 lines]
>
> Tom Ellison
Tom Ellison - 20 Nov 2005 03:58 GMT
Dear JT:

It is not necessarily the case that you cannot rank within multiple values.
If the values are A, B, B, and C, the rank can be 1, 2, 2, 4.  Your could
also ignore duplicates, resulting in values A, B, and C having ranks 1, 2,
and 3.  Either is readily possible.

Are you wanting to find the 90th percentile of all the results in a set
which is a single Id, or a combination of Id and date?  Do you want the
query to simply provide this result, or do you want up to 40 Ranks
displayed?

I don't see part of what you show as your final (Query 3) results.  I
understand the ID, date, and Count columns.  I understand the 7 being in the
Rank1 column.  But what are the two zeros just before that?

I'm guessing a solution might be to have a 2 column "Crosstab Cell."  In
this "cell" you could see both the Result value and a count of how many of
those there are.  When there are multiple Results at the same rank, you
would see this information in that "cell".  While this cannot be done in a
crosstab (it limits you to a single value in each cell), it can be done
using another technique.

If this is desirable, then it must be the case that you do not want to
ignore duplicates, and you want to group by ID and date together.

So, there are several avenues along which we can proceed.

- ignore duplicates or not
- show all ranks or just return the 90th percentile

Please let me know how to proceed.  Please also create a new sample data set
that can be used to illustrate and test this more complex set of
requirements.

Tom Ellison

> Tom,
> Thanks for taking a look at my problem.
[quoted text clipped - 125 lines]
>>
>> Tom Ellison
Jt3mitch1 - 21 Nov 2005 04:50 GMT
Tom,

My use of the term "rank" probably confuses the issue.  I was using it to
identify the results listed in ascending order (for each id and date
combination), with the lowest at the lowest "rank1" and the highest at the
highest "rankXX", while including duplicates.  I got the 0 results from ID#1
on 11/01/01 and had the first 0 listed as rank 1 and the second 0 listed as
rank2.

Because I already have the results individually printed in a report (sorted
by ID and date), I could drop the ranking issue and focus on just returning
the 90th percentile (for each id and date combination) from the original
query.

Each row in my query1 is unique by labnumber

Id_____Date_____ Labnum_____Result
1_____10/01/01_____1_____12.0
1_____10/01/01_____2_____15.0
1_____10/01/01_____3_____10.0
1_____10/01/01_____4_____9.0
1_____10/01/01_____5_____6.0
1_____10/01/01_____9_____8.0
1_____10/01/01_____10_____4.0
1_____10/01/01_____11_____3.0
1_____10/01/01_____12_____0.0
1_____10/01/01_____13_____0.0
1_____11/01/01_____14_____22.0
1_____11/01/01_____15_____17.0
1_____11/01/01_____16_____14.0
1_____11/01/01_____17_____16.0
1_____11/01/01_____18_____10.0
1_____11/01/01_____19_____9.0
1_____11/01/01_____20_____9.0
1_____11/01/01_____30_____8.0
1_____11/01/01_____31_____4.0
1_____11/01/01_____32_____0.0
1_____11/01/01_____33_____0.0
2_____01/01/03_____34_____20.0
2_____01/01/03_____35_____18.5
2_____01/01/03_____36_____17.5
2_____01/01/03_____38_____16.5
2_____01/01/03_____39_____5.5
2_____01/01/03_____40_____3.5
2_____01/01/03_____42_____4.5
2_____01/01/03_____43_____2.5
2_____01/01/03_____44_____1.0
2_____01/01/03_____45_____1.0

The final query would include a count of results for each id/date combination.
The 90th percentile result would be a specific result (roundup(0.9*count))
starting from lowest to highest, ascending.  I round it because I may have
odd result counts.

ID_____date______count_____90th percentile result
1_____10/01/01_____10_____12.0  <-roundup (0.9*10)=9th result
1_____11/01/01_____11_____17.0  <-roundup (0.9*11)=10th result
2_____01/01/03_____10_____18.5  <-roundup (0.9*10)=9th result

I imagine I would have query2 give the count and calculate which result
would be the 90th percentile.

SELECT query1.ID, query1.date, round(0.9*Count([result])) AS 90th
FROM query1
GROUP BY query1.ID, query1.date;

A query3 would then pull the 90th percentile result from the original
query1.  All I need is to figure out how to query the "9th highest" or "10th
highest" result (based on query2 for each Id/date combination).  I think I'll
have to rank them as you suggested with multiple values ranked (eg.
A,B,B,C..) as (1,2,2,4..)

Getting close, thanks again,
JT

> Dear JT:
>
[quoted text clipped - 32 lines]
>
> Tom Ellison
Tom Ellison - 21 Nov 2005 05:28 GMT
Dear JT:

I believe it will be possible to calculate the 90th percentile rank number
as you suggest.  Now there is a problem.  Let's say there are 12 results.
The 90th percentile rank would then be the 11th highest.  However, there may
not be an 11th highest.  If there is a two way tie for 10th place, then
there will be no 11th place.  However, there WILL be 2 10th place results.
Now, while it is possible to return just the Result value of the 10th place
result in this case, it will not be possible to return the Date and Labnum
of this.

It would be possible to use Labnum as a secondary ranking value.  Being
unique, there would then never be any ties.

I'm not really quite up to the task at the moment, but if you'll carefully
consider this and reply I'll try to tackle it in the morning when I'm fresh.
Please specifically answer whether you need only the Result value, or do you
need Date and/or Labnum.  If you look at this carefully, choosing between
two tied rows with the same Result value would seem to be a rather arbitrary
thing to do.  So, given that you have duplicate Result values, showing which
Labnum and Date may not be realistic, since you must arbitrarily choose one
when there is a tie.

Tom Ellison

> Tom,
>
[quoted text clipped - 120 lines]
>>
>> Tom Ellison
Jt3mitch1 - 21 Nov 2005 21:16 GMT
Tom,

If the 90th percentile was the 11th result and we had two results ranked
10th, I would return the 10th ranked value.  

I include labnumber in the query1 only to have a unique value for each
record.  The labnumbers won't be in the same order as the results, so I'm not
sure they could be used for a secondary rank.   In the end, I wouldn't return
the labnumber, I only need the 90thpercentile value for each id/date
combination.  

Thanks again for the extra effort,
JT

> Dear JT:
>
[quoted text clipped - 20 lines]
>
> Tom Ellison
Tom Ellison - 21 Nov 2005 22:43 GMT
Dear JT:

Since you don't require the query to return any other specific information
from the row that is the 90th percentile, then we don't have to worry about
the possibility that there could be more than one row that is tied for the
same value or Result.

Now, if you rank all the rows in each group by ascending values for Result
you can then get the maximum rank value that is less than or equal to the
total number of rows in the group multiplied by 0.9.  Once you have this
rank value, you can obtain the row(s) that rank that way and return the
Result value in that set of rows.  Since all the rows with the same rank
have the same Result value, you can use the MIN(Result) or the MAX(Result)
of all these, since those will all be the same.

OK, that's probably easier said than done.  So, here goes an attempt at
coding all that.

I'm going to make this in pieces, each of which needs to be a saved query.
This allows you to test each piece and make sure it works correctly.  It
also avoids problems where the Jet database really has very limited
capabilities to perform subqueries.

Query3:

SELECT ID, [date], labnumber, result,
   (SELECT COUNT(*) + 1 AS Rank
     FROM query1 Q1A
     WHERE Q1A.ID = Q1.ID
       AND Q1A.[date] = Q1.[date]
       AND Q1A.result < Q1.result) AS Rank
 FROM query1 Q1
 GROUP BY ID, [date]

This should rank the rows within each ID/date group.  This rank value will
be used to find the 90th percentile.

Query4:

SELECT ID, [date],
   MAX(Rank) AS Ninetieth,
 FROM query3 Q3
 WHERE Rank <= 0.9 *
   (SELECT COUNT(*)
     FROM query3 Q3A
     WHERE Q3A.ID = Q3.ID
       AND Q3A.[date] = Q3.[date])
 GROUP BY ID, [date]

One note on this on:  if there is only one row for an ID/date combination,
then this cannot work.  The Rank of that one row would be 1, and there are
then no rows whose rank is less than or equal to 0.9 * 1.

Query5:

SELECT ID, [date], MIN(result) AS result
 FROM Query3 Q3
   INNER JOIN Query4 Q4
     ON Q4.ID= Q3.ID AND Q4.[date] = Q3.[date]
       AND Q4.Ninetieth = Q3.Rank
 GROUP BY ID, [date]

Perhaps, if I've done this carefully enough, this may work and be what you
need.  Please test each query and see what it does, confirming its results.

Just remember, I'm creating progressive program steps here without being
able to test any of them.

Please let me know if this helped and if I can be of any other assistance.

Tom Ellison

> ID          date          labnumber(unique)     result
> 1           10/01/01        1                          15.0
[quoted text clipped - 6 lines]
> 2           01/01/03         8                          1.0
> 2            01/01/03        9                           2.5

Tom Ellison

> Tom,
>
[quoted text clipped - 47 lines]
>>
>> Tom Ellison
Jt3mitch1 - 22 Nov 2005 18:51 GMT
Tom,
Query3 worked fine.  I'm having problems with Query4.

Query3:
SELECT Q1.id, Q1.date, Q1.labnumber, Q1.result, (SELECT COUNT(*) + 1 FROM
query1 Q1A WHERE Q1A.ID = Q1.ID AND Q1A.[date] = Q1.[date] AND Q1A.result <
Q1.result) AS Rank
FROM query1 AS Q1;

Query4:
SELECT Q3.id, Q3.date, Max(Q3.Rank) AS Ninetieth
FROM query3 AS Q3
WHERE (((Q3.Rank)<=0.9*(SELECT COUNT(*) FROM query3 Q3A  WHERE Q3A.ID =
Q3.ID AND Q3A.[date] = Q3.[date])))
GROUP BY Q3.id, Q3.date;

It looks like query4 should work like expected.  Access 2000 doesn't seem to
like the WHERE criteria.  I'm able to see design and SQL view, but can't run
the query.  Nothing happens when I try to open or run the query.  
When I close query4 and try to close another query or table, it gives me a
"this action will reset the current code" error message and I have to close
the objects in design view.  I tried creating a test database with only 1
table and the 3 queries, but had the same problem.

Thanks again,
JT

> Dear JT:
>
[quoted text clipped - 67 lines]
>
> Tom Ellison
Tom Ellison - 22 Nov 2005 20:25 GMT
Dear JT:

I'm looking at the code, edited slightly for my convenience, below:

SELECT Q3.id, Q3.date, Max(Q3.Rank) AS Ninetieth
 FROM query3 AS Q3
 WHERE Q3.Rank <= 0.9*
   (SELECT COUNT(*)
     FROM query3 Q3A
     WHERE Q3A.ID = Q3.ID
       AND Q3A.[date] = Q3.[date])
 GROUP BY Q3.id, Q3.date;

You say this "doesn't seem to like the WHERE criteria."

Try filtering it to only a single ID.

SELECT Q3.id, Q3.date, Max(Q3.Rank) AS Ninetieth
 FROM query3 AS Q3
 WHERE Q3.Rank <= 0.9*
   (SELECT COUNT(*)
     FROM query3 Q3A
     WHERE Q3A.ID = Q3.ID
       AND Q3A.[date] = Q3.[date])
 WHERE id = 111 (put your actual number in here)
 GROUP BY Q3.id, Q3.date;

Try letting it run overnight, just to see.

It doesn't seem to me there's something wrong.  However, Access Jet has long
been very limited.  I switched to MSDE because of this.  One of my best
decisions ever.  I can practically guarantee you you would not have such
problems with MSDE.

Tom Ellison

> Tom,
> Query3 worked fine.  I'm having problems with Query4.
[quoted text clipped - 110 lines]
>>
>> Tom Ellison
Jt3mitch1 - 22 Nov 2005 21:31 GMT
Tom,
I was running the queries on a table of 31 records, so I thought it
shouldn't need to run long. Because we didn't see anything wrong with the
SQL, I decided to jump to another computer and run it there.  It worked
magnificently.  I'm not sure why it blows up on my computer, I'll probably
try and reinstall Access and hope that fixes the problem.

Thanks for all the help,
JT

> Dear JT:
>
[quoted text clipped - 31 lines]
>
> Tom Ellison
Tom Ellison - 22 Nov 2005 22:25 GMT
Dear JT:

Only "magnificently"?  Well, I guess that will just have to do.

Tom Ellison

> Tom,
> I was running the queries on a table of 31 records, so I thought it
[quoted text clipped - 42 lines]
>>
>> Tom Ellison
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.