MS Office Forum / Excel / Worksheet Functions / July 2007
Counting unique items based on date
|
|
Thread rating:  |
DKS - 20 Jul 2007 11:52 GMT I have a sheet, with each row containing one record. Column A contains a date, and column B contains a department. The contents of the column A (date) are either a simple date "dd/mm/yyyy" format or a timestamp e.g. "dd/mm/yyyy hh:mm:ss". Duplicates are allowed in Column A, Column B, Column A + Column B.
I needed (ideally) a worksheet function that would allow me to obtain the following information:
Based on a START DATE, END DATE and DEPT NAME return to me the number of times I have the DEPT NAME occuring in my data for between the START DATE and END DATE (both dates included) but counting multiple rows for the same date as only 1 record. Thus if I have 27 rows for the same department with date July 15, then it should give me only the value 1.
The time-part in the timestamp format can be ignored. It exists because of input data coming from various sources and some sources store time and some do not store time.
How could I do the above with a worksheet function?
Many thanks.
Don Guillett - 20 Jul 2007 15:21 GMT try this approach =sumproduct((a4:a22>b1)*(a4:a22<b2)*(b4:b22="deptname"))
 Signature Don Guillett Microsoft MVP Excel SalesAid Software dguillett1@austin.rr.com
>I have a sheet, with each row containing one record. Column A contains a > date, and column B contains a department. The contents of the column A [quoted text clipped - 24 lines] > > Many thanks. Ron Coderre - 20 Jul 2007 15:32 GMT Try something like this:
With this structure in A1:A16 Date Dept 1/1/2007 A 1/1/2007 A 1/1/2007 A 1/1/2007 B 1/1/2007 C 1/1/2007 C 1/1/2007 C 1/15/2007 A 1/15/2007 B 1/15/2007 C 1/15/2007 C 2/1/2007 A 2/1/2007 A 2/1/2007 B 2/1/2007 B
And... E1: (StartDate eg: 01/01/2007) F1: (EndDate eg: 01/31/2007) G1: (DeptName eg: A)
Then this formula returns the number of unique dates in that range where the DeptName matches the name in G1: =SUMPRODUCT(--ISNUMBER(MATCH(ROW(INDEX(A:A,E1):INDEX(A:A,F1)),(B2:B20=G1)*(A2:A20>=E1)*(A2:A20<=F1)*A2:A20,0)))
Using the above example, the formula returns: 2 Dept A appears at least once with 1/1/2007 and 1/15/2007
Is that something you can work with? *********** Regards, Ron
XL2003, WinXP
> I have a sheet, with each row containing one record. Column A contains a > date, and column B contains a department. The contents of the column A [quoted text clipped - 18 lines] > > Many thanks. DKS - 20 Jul 2007 19:20 GMT Hi Don, Ron
Thanks for your suggestions. I tried them but they did not work. I forgot to mention that the data may not always be sorted by date. I hope that does not affect your logic?
To give you extra info on what went wrong: the formula of Don gave me an extremely high number. For example: I had data for one calendar year but the result of Don's formula was more than 366. This is not possible because there are only 366 unique dates in a year. The formula of Ron returned zero as a result.
Thanks.
> Try something like this: > [quoted text clipped - 57 lines] > > > > Many thanks. Ron Coderre - 20 Jul 2007 19:38 GMT I tried scrambling the dates and DeptNames and I still get correct (to me) results.
Try this: Play with my posted example and see if it gives correct results under those conditions. -If yes ...Try putting some of your data into the example and test again. -If No, then the data is the problem and you'll need to post some sample data so we can see what you're dealing with?
Either way, please update us on your progress.
*********** Regards, Ron
XL2003, WinXP
> Hi Don, Ron > [quoted text clipped - 71 lines] > > > > > > Many thanks. Don Guillett - 20 Jul 2007 19:51 GMT I didn't see that you only wanted UNIQUE dates.
 Signature Don Guillett Microsoft MVP Excel SalesAid Software dguillett1@austin.rr.com
>I tried scrambling the dates and DeptNames and I still get correct (to me) > results. [quoted text clipped - 107 lines] >> > > >> > > Many thanks. Ron Coderre - 20 Jul 2007 21:02 GMT Not me, Don (I don't have a preference)....but DKS wants unique dates. :)
(BTW......Congratulations on the MVP award)
*********** Regards, Ron
XL2003, WinXP
> I didn't see that you only wanted UNIQUE dates. > [quoted text clipped - 109 lines] > >> > > > >> > > Many thanks. Don Guillett - 20 Jul 2007 23:59 GMT I was saying that yours was the way to go and "Gracias Senor" for the kind words.
 Signature Don Guillett Microsoft MVP Excel SalesAid Software dguillett1@austin.rr.com
> Not me, Don (I don't have a preference)....but DKS wants unique dates. :) > [quoted text clipped - 131 lines] >> >> > > >> >> > > Many thanks. T. Valko - 21 Jul 2007 04:34 GMT Congrats, Don!
I has assumed you were already but didn't make it public.
 Signature Biff Microsoft Excel MVP
>I was saying that yours was the way to go and "Gracias Senor" for the kind >words. [quoted text clipped - 137 lines] >>> >> > > >>> >> > > Many thanks. Don Guillett - 21 Jul 2007 13:11 GMT Thanks again. To you too.
 Signature Don Guillett Microsoft MVP Excel SalesAid Software dguillett1@austin.rr.com
> Congrats, Don! > [quoted text clipped - 143 lines] >>>> >> > > >>>> >> > > Many thanks. orchid11652 - 24 Jul 2007 21:24 GMT Hi - o.k. - I can't help asking because your formula is so close to something I need. See my pasted data below:
Event CodePlex Service ID Dates Scheduled* EART309 SHO SH2 6/7/2006, 1/5/2007 EART309 SHO SHO 3/5/2006, 3/6/2006, 3/7/2006, 3/8/2006, 3/9/2006, 3/10/2006, 6/7/2006
EART309 WOM WOM 12/9/2006 EART312 SHO SH2 6/7/2006, 1/6/2007 EART312 SHO SHO 3/26/2006, 3/27/2006, 3/28/2006, 3/29/2006, 3/30/2006, 3/31/2006, 1/6/2007
EART312 WOM WOM 12/12/2006
Some of the data is wrapped so I'll explain the format. There are 4 fields - an Event, Plex, Service ID and Date. Above is the exact excel export of the data from an auxiliary system. The dates are populated into a single excel field, however, I can parse them to populate a single date per field if it makes the formula easier. What I need is a formula that totals all the unique dates for each Event Code/Plex combination. The challenge is that there can be multiple rows with the same Event Code and Plex, each which may have the same date(s) which may only be counted once. In the above sample data, the formula should return a quantity of 8 for the combination of Event Code/Plex "EART309/SHO" and 8 for "EART312/SHO". I thought maybe an Access DB would work better since I need to download the date data and create ITD reports on a regular basis by Event. Any advice on if a formula or Access would be better would be appreciated as well.
Thanks.
Orchid11652
> dguillett1@austin.rr.com > > Congrats, Don! [quoted text clipped - 144 lines] > >>>> >> > > > >>>> >> > > Many thanks. Ron Coderre - 25 Jul 2007 00:00 GMT I think I came up with something that works......
With your posted sample data in A1:D7 (I'll break it up to avoid Text Wrap): A1:C7 contains: Event CodePlex Service ID EART309 SHO SH2 EART309 SHO SHO EART309 WOM WOM EART312 SHO SH2 EART312 SHO SHO EART312 WOM WOM
D1:D7 contains: Dates Scheduled* 6/7/2006, 1/5/2007 3/5/2006, 3/6/2006, 3/7/2006, 3/8/2006, 3/9/2006, 3/10/2006, 6/7/2006 12/9/2006 6/7/2006, 1/6/2007 3/26/2006, 3/27/2006, 3/28/2006, 3/29/2006, 3/30/2006, 3/31/2006, 1/6/2007 12/12/2006
AND.... E1: (StartDate....of the range of dates to be searched, eg 3/1/2006) F1: (EndDate....of the range of dates to be searched, eg 12/31/2007)
Then....try this:
G1: (an event, eg EART309) H1: (a codeplex, eg SHO)
This ARRAY FORMULA returns the count of unique dates within the start/end range for the Event/CodePlex combination in G1:H1 I1: =SUM(--NOT(ISNA(MATCH("*"&TEXT(ROW(INDEX($A:$A,$E$1):INDEX($A:$A,$F$1)),"m/d/yyyy")&"*",IF($A$2:$A$7&$B$2:$B$7=G1&H1,$D$2:$D$7),0))))
Note: For array formulas, [Ctrl] [Shift] and press [Enter], instead of just pressing [Enter]. With sample tests, these are the results: EART309, SHO.........Returns 8 EART309, WOM........Returns 1 EART312, SHO.........Returns 8
Is that something you can work with? *********** Regards, Ron
XL2003, WinXP
> Hi - o.k. - I can't help asking because your formula is so close to > something I need. See my pasted data below: [quoted text clipped - 177 lines] > > >>>> >> > > > > >>>> >> > > Many thanks. orchid11652 - 25 Jul 2007 02:16 GMT Thanks! I will try it and let you know.
> I think I came up with something that works...... > [quoted text clipped - 228 lines] > > > >>>> >> > > > > > >>>> >> > > Many thanks. orchid11652 - 25 Jul 2007 21:54 GMT Hi - I tried your example but I'm having trouble replicating the result for 'EART309,WOM' or the other combination with only a single date. It is giving me a zero. If I populate an additional date, it gives the correct count of 2. I will work with it and see if I set up the format correctly. Thanks! This was very helpful.
> I think I came up with something that works...... > [quoted text clipped - 228 lines] > > > >>>> >> > > > > > >>>> >> > > Many thanks. Ron Coderre - 25 Jul 2007 22:08 GMT The formula I posted searches for TEXT values. Consequently, when the "Dates Scheduled*" field only has one "date" in it, you must make sure the value is TEXT, not an actual date.
You can do that by either: •prepending an apostrophe in front of the date (eg '03/01/2007 ) or •Setting the numeric format of the cells to Text before entering data. <format><cells><number tab>....Category: Text
Does that help? *********** Regards, Ron
XL2003, WinXP
> Hi - I tried your example but I'm having trouble replicating the result for > 'EART309,WOM' or the other combination with only a single date. It is giving [quoted text clipped - 234 lines] > > > > >>>> >> > > > > > > >>>> >> > > Many thanks. DKS - 21 Jul 2007 20:52 GMT Ron,
Your code worked. I had made a mistake whilst typing the formula you gave. Instead of typing A:A as you have mentioned, I inadvertently typed A2:A25000 to signify the full range. And apparently with that it did not work.
But now I have corrected the formula and it seems to be working like charm. I still have to test with time-stamps but I believe it should work.
thanks for your help.
> I tried scrambling the dates and DeptNames and I still get correct (to me) > results. [quoted text clipped - 90 lines] > > > > > > > > Many thanks. DKS - 21 Jul 2007 21:02 GMT First test with timestamp included in the date, and the date being one of the extremes being tested and the formula does not work.
For example: in your test-data hereunder if I put only one record for dept A for date March 15; and if I test for Jan 1st till Marc 15th for dept A, I get one number less than the actual number.
Just FYI.
> Ron, > [quoted text clipped - 101 lines] > > > > > > > > > > Many thanks. Ron Coderre - 22 Jul 2007 03:08 GMT Wait a minute......timestamp?!? You may not think so, but that was an important detail to leave out.
Here's why:
Excel treats dates as whole numbers and times as decimal fractions of a day. 15-MAR-2007 is 39,156 to Excel Noon on 15-MAR-2007 is 39156.5
We've been matching dates. So, when you enter something like 03/15/2007 12:00:00, that value is greater than the tested end point of 03/15/2007.
To compensate for time values, try this formula: =SUMPRODUCT(--ISNUMBER(MATCH(ROW(INDEX(A:A,E1):INDEX(A:A,F1)),INDEX(INT((B2:B20=G1)*(A2:A20>=E1)*(INT(A2:A20)<=F1)*A2:A20),0),0)))
Does that help? *********** Regards, Ron
XL2003, WinXP
> First test with timestamp included in the date, and the date being one of the > extremes being tested and the formula does not work. [quoted text clipped - 110 lines] > > > > > > > > > > > > Many thanks. Ron Coderre - 22 Jul 2007 17:58 GMT A somewhat shorter version implements the FREQUENCY function. My only reluctance in using it is that it can be a bit confusing to figure out.
With the actual data in A2:B20, Col_A contains dates, Col_B contains DeptNames, and... E1: (StartDate) F1: (EndDate) G1: (a dept name)
Here are both versions: NON-array formula =SUMPRODUCT(--ISNUMBER(MATCH(ROW(INDEX(A:A,E1):INDEX(A:A,F1)),INDEX(INT((B2:B20=G1)*(A2:A20>=E1)*(INT(A2:A20)<=F1)*A2:A20),0),0)))
ARRAY formula (committed with Ctrl+Shift+Enter, instead of Enter) =SUMPRODUCT(--ISNUMBER(MATCH(ROW(INDEX(A:A,E1):INDEX(A:A,F1)),INT((B2:B20=G1)*(A2:A20>=E1)*(INT(A2:A20)<=F1)*A2:A20),0)))
NON-array formula, using FREQUENCY =SUMPRODUCT(--((B2:B21=G1)*(FREQUENCY((B2:B20=G1)*(A2:A20>=E1)*(INT(A2:A20)<=F1)*INT(A2:A20),INT(A2:A20))>0)))
Note the first part: (B2:B21=G1) extends 1 row below the actual data area to accommodate the way the FREQUENCY function constructs its bins.
Does that help? *********** Regards, Ron
XL2003, WinXP
> Wait a minute......timestamp?!? > You may not think so, but that was an important detail to leave out. [quoted text clipped - 132 lines] > > > > > > > > > > > > > > Many thanks. Teethless mama - 21 Jul 2007 19:10 GMT Try this:
=SUM(IF(FREQUENCY(IF((Dept="A")*(TEXT(Date,"mmmyy")="Jan07"),INT(Date)),INT(Date))>0,1))
> I have a sheet, with each row containing one record. Column A contains a > date, and column B contains a department. The contents of the column A [quoted text clipped - 18 lines] > > Many thanks.
|
|
|