MS Office Forum / Excel / New Users / July 2006
Excel data consolidation question
|
|
Thread rating:  |
drodysill@gmail.com - 21 Jun 2006 21:09 GMT I had a question -- I have a series of lists of users based on a userid
text string and their name, some of which are duplicates. What i've done is import the lists into excel and then do a sort by userid. Additionally, I made a note of which database they're each in, using a different column for each of the four databases. I first copied all of the entries from "users1" and then copied "users1" into every row as column C. Next, I copied the "users2" list and for those, put "user2" in column D...and so on up to users4.
Now, what I want to do is consolidate the entries, making sure there is
only ONE entry PER userid, but say if a user is in "user2" and "user3" and thus has those entries in column D and E respectively, I want to note both. Basically, I just want one row per userid and for each userid, that row should include all "user(x)" column entries that they exist in. Any ideas? I could also use access if that would be easier to
do this.
If it helps, I'm eventually going to compile all of these spreadsheets (which are seperated by computer application they have access to...and each of these "user(x)" subentries are different authentication databases within that program)...and attempt to spit out a master list that has one row per userid with all of the programs they have access to (we wouldn't need to worry about the 'user(x)' subentries at that point though, just if they exist in ANY of them).
Thanks much in advance or let me know if something isn't clear...
-Drew
aaron.kempf@gmail.com - 21 Jun 2006 22:12 GMT Excel isn't a database.
Spit on anyone that uses Excel instead of a database
it isn't the right tool for anything.
-Aaron
> I had a question -- I have a series of lists of users based on a userid > [quoted text clipped - 27 lines] > > -Drew atomiks - 22 Jun 2006 16:02 GMT Ok true...very helpful in this case. It IS however good for easily entering data that can be later compiled...which was my actual question (note use of "I could also use MS Access"). Anyone have any real ideas?
> Excel isn't a database. > [quoted text clipped - 35 lines] > > > > -Drew Harlan Grove - 22 Jun 2006 17:31 GMT drodysill@gmail.com wrote...
>I had a question -- I have a series of lists of users based on a userid >text string and their name, some of which are duplicates. What i've [quoted text clipped - 12 lines] >exist in. Any ideas? I could also use access if that would be easier to >do this. ...
So each list contains just user ID (UID) and name (UNAME) fields? And the consolidated table's col C through F would also contain the UID?
You need to start with a master list of all UIDs, which means it'd be easier to add a 3rd column to each table for the list ID (LID), so user1 would have LID 1, user2 LID 2, etc. Then append the augmented lists into a single long list with 3 columns. I'll call it SLL. Sort it in ascending order first by UID then LID.
At that point you could use Advanced Filters to extract only one instance of each UID and UNAME in SLL. Then you could add formulas to the columns to the right of the distinct UID and UNAME to indicate the UIDs' presence in each of the original lists. If the topmost UID and UNAME were in X2:Y2, try the formulas
AD2: =MATCH(X2,INDEX(SLL,0,1),0)
AE2: =COUNTIF(INDEX(SLL,AD2,1):INDEX(SLL,AD2+3,1),X2)-1
Z2: =IF(COUNTIF(INDEX(SLL,AD2,3):INDEX(SLL,AD2+AE2),COLUMNS($Z2:Z2)),X2,"")
Fill Z2 right into AA2:AC2, then select Z2:AE2 and either double click on the fill handle or fill down into the same rows as the UIDs and UNAMEs in cols X and Y.
If you could live with 1s or 0s indicating whether or not a particular UID appears in a given table, you could use SLL to generate a pivot table with UID and UNAME as the row variables and LID as the column variable with operation COUNT.
aaron.kempf@gmail.com - 22 Jun 2006 18:26 GMT Excel isn't GOOD for data entry.
it isn't GOOD for consolidating multiple spreadsheets into a single data store.
Databases are a much better long term solution.
Excel is a disease. Are you a leper?
-Aaron
> drodysill@gmail.com wrote... > >I had a question -- I have a series of lists of users based on a userid [quoted text clipped - 47 lines] > table with UID and UNAME as the row variables and LID as the column > variable with operation COUNT. Harlan Grove - 22 Jun 2006 21:45 GMT aaron.kempf@gmail.com wrote...
>Excel isn't GOOD for data entry. Only because you don't really know how to use it.
>it isn't GOOD for consolidating multiple spreadsheets into a single >data store. Again, you don't know how to use it.
>Databases are a much better long term solution. Debatable. Depends on how easy it is to access the information. The OP mentions different authentication *databases*. If they're truly database tables, then best to use whatever database that is.
aaron.kempf@gmail.com - 22 Jun 2006 21:59 GMT Harlan
you're just flat out full of sh.t.
Don't shoot the messenger-- just because I speak the truth.
A) manager has each employee fill out their hours in a spreadsheet B) each employee does this and emails it into the manager.
how does the manager import 30 spreadsheets and consolidate numbers out of all 30 workbooks?
please give me details... you are absolutely full of crap harlan.
C) what happens when someone enters 02042006 instead of 2/4/2006 in a column? Excel CHOKES on this simple type of data mismatch D) what happens when Susie; over in marketing-- wants to take vacation days. She adds a column called 'vacation hours' and emails it to her boss. Seems like a perfectly natural thing to do.
SO HOW DOES EXCEL IMPORT FROM 30 DIFFERENT WORKBOOKS THEN HARLAN?
Don't talk sh.t about something WHEN YOU'RE WRONG.
Excel is a disease; and I spit on anyone that uses Excel for anything.
Spit in your face.
There are better ways-- email someone a form in Access; it gets converted to a DAP (plain HTML); they enter all their data and presto-chango-- I am ALREADY DONE.
So let's compare your 29 steps -- with rampant points of failure-- to my much much much simpler and scalable solution.
HOW DO YOU LIKE ME NOW, PUNK?
Grow some balls and learn a real program; Harlan.
What a waste of time.
'yeah; it's more efficient to consolidate 30 spreadsheets than to keep the data in a single database'
eat sh.t harlan you spew nothing but worthless garbage.
Excel is a waste of time.. A bigger time-waster than Internet Explorer and Solitaire combined.
ANY COMPANY THAT WANTS TO THRIVE-- TO SUCCEED-- SHOULD UNINSTALL EXCEL FROM EVERY MACHINE IN THEIR COMPANY.. AND WORK TO BUILD THE INFRASTRUCTURE THAT THEY REALLY NEED.
uninstall excel from every machine at every company.
I have a free solution that is scalable and mulitple people can edit their own data at the same time.
Harlan has a piece of sh.t technology that is inherently single user; and not scalable enough to begin to meet the needs it's users.
Excel is crap.
I have a better spreadsheet than Excel; it doesn't require installation; it doesn't require copying and pasting hundreds of work books.
It has version control. It has an audit trail.
I have pivotTables that are a thousand times better than your silly XLS pivotTables.
> aaron.kempf@gmail.com wrote... > >Excel isn't GOOD for data entry. [quoted text clipped - 11 lines] > mentions different authentication *databases*. If they're truly > database tables, then best to use whatever database that is. Harlan Grove - 22 Jun 2006 22:55 GMT aaron.kempf@gmail.com wrote... ...
>how does the manager import 30 spreadsheets and consolidate numbers out >of all 30 workbooks? http://www.google.com/groups?selm=hkQVb.2432%24_4.259%40www.newsranger.com
Use the first approach.
>C) what happens when someone enters 02042006 instead of 2/4/2006 in a >column? They get a bunch of #'s rather than a date. If the spreadsheet is well-written (so something you couldn't manage), the date could validated using a formula like
=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60))
which evaluates to 1 if the year of DateEntry is within 60 years of 1960, so 1900..2020, which seems a reasonable range in 2006. Simple enough to check narrower, more recent ranges. And using this validation formula, ad hoc diagnostic messages could be implemented with formulas like
=IF(ValidDateEntry,"","Invalid date entry in cell "&CELL("Address",DateEntry))
> Excel CHOKES on this simple type of data mismatch Yes, it would if you were so incompetent not to check all entries. Any cell can contain any value. That's flexibility, but it comes at a price: it's up to the person writing the formulas that use user entries to ensure those entries are valid. If that's different from database, tough, this is just how spreadsheets work.
>D) what happens when Susie; over in marketing-- wants to take vacation >days. She adds a column called 'vacation hours' and emails it to her boss. >Seems like a perfectly natural thing to do. If the worksheet were protected, she wouldn't be able to insert anything. At that point she'd need to call he boss to ask how to include vacation hours. Of course this raises the question whether time sheets should include anything other than work hours, and if they should, why wouldn't there already be entries for vacation hours?
Feable effort creating this straw man, but this may be all you can dream up.
>There are better ways-- email someone a form in Access; it gets >converted to a DAP (plain HTML); they enter all their data and >presto-chango-- I am ALREADY DONE. And if they forward the e-mail to, say, their home e-mail account so they can fill it out in the evening, would they be able to make entries to your database from any machine with an internet connection? If so, what prevents anyone else from feeding garbage into your database?
>I have a free solution that is scalable and mulitple people can edit >their own data at the same time. Really? Where's the url to download it so anyone can see whether this claim is BS or not?
To the OP: Don't mind Aaron. He's right to suggest that Access may be better than Excel for this provided you can access the authentication lists via ODBC. But when it comes to the antispreadsheet ranting, he's just angry because he's never been able to figure out how to use them.
aaron.kempf@gmail.com - 23 Jun 2006 00:57 GMT Harlan;
So you're going to do this
=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60))
For each of a trillion different cells.
And then you'll open up each spreadsheet by hand and if it has a 1 in there; then you'll know that you've got to change this value by hand?
Are you kidding me??
Re:
> Really? Where's the url to download it so anyone can see whether this > claim is BS or not? it's just functionality that is built into Access.
File, SendTo
if worst comes to worse; you can save your form as a Data Access Page and then email it around.
I've built quite a few complex 'Excel Replacement' solutions using this type of technology.
but it looks just like a webpage; it is a simple webpage. it uses these components called 'office web components'
When I say 'components' think of something similiar to Adobe Acrobat Reader; or Flash.. all it consists of is a 'Object Tag' in HTML.
and these components are what make Excel completely and utterly OBSOLETE.
you know when you save a spreadsheet as HTML and 'add interactivity'? that is what I am talking about.
It isn't a piece of Excel-- these are primarily components that are best created using Microsoft Access (or something like dreamweaver for example).
those are the components that I'm referring to. I use those ALL DAY LONG; EVERY DAY. and they provide things like:
a) drilldown - the ability to have a drilldown effect in a pivotTable b) displaying a field; but keep it collapsed-- so that drilldown is easy c) the ability to create custom fields INSIDE the pivotTable (you have to create custom formulas OUTSIDE of a pivotTable) d) the ability to have 250,000 rows in a 'spreadsheet'
Here is a basic page to help you to get your feet wet with Access http://www.bcschools.net/staff/AccessHelp.htm
Here is a starter page that describes some funcitonality that is found within Data Access Pages http://office.microsoft.com/en-us/assistance/CH062526501033.aspx
Information about emailing Data Access Pages http://office.microsoft.com/en-us/assistance/HP030890051033.aspx
you don't need 'Access' on your machine in order to fill out these forms. All you need is a valid office license (2002 or 2003) and then one of these products:
a) Access b) Excel c) Word d) Outlook
I think that any of those 4 products counts as a license to use Office Web Components.
-Aaron
> aaron.kempf@gmail.com wrote... > ... [quoted text clipped - 63 lines] > lists via ODBC. But when it comes to the antispreadsheet ranting, he's > just angry because he's never been able to figure out how to use them. Nick Hodge - 23 Jun 2006 07:38 GMT Aaron
Three things about OWC
1) They are ActiveX and now blocked by default in WinXP SP2 2) They are deprecated in O12 3) Any web presentation is better hosted on the server (asp, asp.net, php, etc) and presented in some of the new controls which implement XHTML, etc which give rich user views, with no security warnings and server security, without *any* need for client installed controls
 Signature HTH Nick Hodge Microsoft MVP - Excel Southampton, England www.nickhodge.co.uk nick_hodgeTAKETHISOUT@zen.co.uk.ANDTHIS
> Harlan; > [quoted text clipped - 142 lines] >> lists via ODBC. But when it comes to the antispreadsheet ranting, he's >> just angry because he's never been able to figure out how to use them. aaron.kempf@gmail.com - 23 Jun 2006 17:47 GMT Nick;
you are such a f.cking liar. they're not blocked in XP SP2.
they're not depecrated in Office 12. Office 12 hasn't shipped yet fucktard so go screw yourself.
'any web presentation is better hosted on the server'
oh i'm so glad that you are the SINGLE person that has EVERY POSSIBLE understanding of how it's soooooo inefficient to run things on the clientside.
I mean; if everything should run on the serverside; why is google and yahoo and microsoft making powerful clientside AJAX applications?
OWC are the same thing as AJAX for all practical reasons.... a richer client experience. there isn't an interface in the world; anywhere-- that can compete with OWC pivotTables for price, performance, portability and did i mention PRICE?
your simple server-side pages are for losers.
Aren't you tired of posting back to the webserver a dozen times on every page?
I personally hate how when you're typing something in windows live mail; it's talking to the server and trying to auto-complete the name of the contact that you're typing.
it's like the worst; slowest design EVER.
yeah.. 'everything should run on the server side'
kids.. sh.t
so uh.. where are flash add-ins going to run then; kid??
-Aaron
> Aaron > [quoted text clipped - 161 lines] > >> lists via ODBC. But when it comes to the antispreadsheet ranting, he's > >> just angry because he's never been able to figure out how to use them. Nick Hodge - 23 Jun 2006 18:26 GMT Aaron
For so long as your route to make your point is personal abuse to an unsatisfactory level, I'll withdraw, not because your aggression intimidates me, but because my intelligence and care for others stops me from going to your level and I think everyone now see's the 'cut of your cloth'.
Suffice to say, you will notice that Office 12 will not ship with OWC...fact, ActiveX controls will 'at best' fire the warning bar at the top in IE. The comment on server side and client side controls ends up a little like our ongoing 'discussions' on Excel/Access, each has there use and the 'cross-over' may be a little grey.
As I said....from my standpoint, whatever your reply on this....End!
 Signature HTH Nick Hodge Microsoft MVP - Excel Southampton, England www.nickhodge.co.uk nick_hodgeTAKETHISOUT@zen.co.uk.ANDTHIS
> Nick; > [quoted text clipped - 212 lines] >> >> he's >> >> just angry because he's never been able to figure out how to use them. aaron.kempf@gmail.com - 23 Jun 2006 18:36 GMT Nick;
I just disagree with you trying to spread lies
OWC isn't 'going away' it will be supported; according to what MS has told us-- until 2011-2013.
That is 'long enough' for most projects.
I dont know what Office 2007 will hold. I don't really care.
All I know is that there ISN"T a better option today for displaying and editing data.
PivotTables; Spreadsheet and Data Access Pages-- .NET can .NET compete with this stack.
-Aaron
> Aaron > [quoted text clipped - 235 lines] > >> >> he's > >> >> just angry because he's never been able to figure out how to use them. aaron.kempf@gmail.com - 23 Jun 2006 17:49 GMT and for the record; it ISN"T a client installed control-- since it comes with OFFICE FUCKTARD
and even if someone didn't have it installed; it's easy to prompt people-- a single line of plain HTML says 'click here to install' or the path to the CAB files on your webserver; it's pretty much self-installing
-Aaron
> Aaron > [quoted text clipped - 161 lines] > >> lists via ODBC. But when it comes to the antispreadsheet ranting, he's > >> just angry because he's never been able to figure out how to use them. Harlan Grove - 23 Jun 2006 23:42 GMT aaron.kempf@gmail.com wrote... ...
>and even if someone didn't have it installed; it's easy to prompt >people-- a single line of plain HTML says 'click here to install' or >the path to the CAB files on your webserver; it's pretty much >self-installing ...
So you really don't care much about security, do you? Just have users get into the habit of installing any old software from any old website? FWIW, worms, trojans, spyware are also pretty much self-installing.
dbahooker@hotmail.com - 22 Jul 2006 14:00 GMT Harlan
idiot obsolete spreadsheet grunts breed like rabbits too.
Does it mean that we should castrate people with no balls (technically the point of cutting off a penis when you dont have balls might not be necessary)
I think that we should.
I also think-- that more importantly-- we should stop teaching kids about Excel also. We don't want them to be stuck on the first step of software development-- which is what you kids are.
stuck on the first step of software development; you don't follow proven methodologies.
Maybe; if good old-fashioned Americans weren't scared of being seen 'a geek'-- maybe if real software development was more 'trendy'-- then maybe real software development companies might be able to find real qualified candidates _INSIDE_ the US.
Lose the training wheels or I'll cut off your penis.
-Aaron
> aaron.kempf@gmail.com wrote... > ... [quoted text clipped - 7 lines] > get into the habit of installing any old software from any old website? > FWIW, worms, trojans, spyware are also pretty much self-installing. Harlan Grove - 24 Jun 2006 00:43 GMT aaron.kempf@gmail.com wrote...
>So you're going to do this > >=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60)) > >For each of a trillion different cells. Excel wouldn't be a good tool to use for data entry on the order of trillions of cells. First off, I haven't heard of any multiterabyte drives for anything that could be called a PC. Second, it'd take several thousand man-years to enter a trillion cells manually. FWIW, it'd take as long to enter this many date fields manually using Access.
As for using this against a few hundred cells, if the entry cells were all in a single area range, you could use the following formula instead to check all of them at once.
=--(COUNT(1/(ABS(YEAR(DateEntryRange)-1960)<=60))=COUNT(Range))
This would also give 1s and 0s as results.
>And then you'll open up each spreadsheet by hand and if it has a 1 in >there; then you'll know that you've got to change this value by hand? > >Are you kidding me?? Playing dumb again? Or is it playing?
The point of validation formulas is to catch entry errors IMMEDIATELY after entry. If the user chooses to ignore the invalid entry, then, yes, they'd have to change it later if they wanted it to be treated as a date. Someone has to change the invalid entry, so it's left up to the user to change it.
As an alternative, you could give the date entry cell the number format Text, in which case it'd be treated as text, then use a defined name to convert it into a date using a formula definition like
=IF(COUNT(1/(ABS(YEAR(A1)-1960)<=60)),A1, IF(COUNT(--TEXT(--A1,"[<1000000]0\/0\/0000;0\/00\/0000")), --TEXT(--A1,"[<1000000]0\/0\/0000;0\/00\/0000"),"?"))
It IS possible to adapt to inconsistent data entry in Excel, but it usually isn't worth it.
>Re: >>Really? Where's the url to download it so anyone can see whether this [quoted text clipped - 3 lines] > >File, SendTo This was in response to your claim about free solutions to multiple user data entry. The 'free' part obviously ignoring the need to purchase Access for the machine designing the DAP.
>but it looks just like a webpage; it is a simple webpage. >it uses these components called 'office web components' ...
It requires that if users would be using these forms within their e-mail client, their e-mail client would need to be configured to run ActiveX controls. That eliminates any company using Lotus Notes or any other e-mail client that doesn't run ActiveX controls as well as sensible, security-conscious companies that disable ActiveX controls in e-mail clients.
>When I say 'components' think of something similiar to Adobe Acrobat >Reader; or Flash.. all it consists of is a 'Object Tag' in HTML. ...
Or Nimda.
aaron.kempf@gmail.com - 23 Jun 2006 00:59 GMT Harlan
and for the record? the thing that makes this SECURE?
it's called 'windows authentication'
you can use SQL authenticaiton if you would prefer; you can even encrpt the HTML so that it isn't readable in plain-text.
it does allow for offline-usage.. that would require something on the clientside to push the XML buffer back to the server.
-Aaron
> aaron.kempf@gmail.com wrote... > ... [quoted text clipped - 63 lines] > lists via ODBC. But when it comes to the antispreadsheet ranting, he's > just angry because he's never been able to figure out how to use them. atomiks - 23 Jun 2006 06:47 GMT Aaron -- can you tell me how I was trying to do in access? You appear to actually know what you're doing, and if it's so easy and you're so proficient, i'm actually interested in seeing how you'd do this.
> Harlan > [quoted text clipped - 77 lines] > > lists via ODBC. But when it comes to the antispreadsheet ranting, he's > > just angry because he's never been able to figure out how to use them. aaron.kempf@gmail.com - 23 Jun 2006 17:58 GMT 1) open MS Access and create a new application / database with a mdb extension if you're a newbie 2) Tools, Options, 'Pages' set up a UNC path (to a server share; or mapped network drive or something) for the 'default connection file' 3) right-click import your spreadsheet data into Access. 4) fix some of the field names; etc to not have spaces; etc 5) select the table on which you want people to enter data 6) make a view that filters WHERE USER = SUSER_SNAME() 7) select the view (oir query) in the database window 8) go INSERT, PAGE and fill out the data access page wizard. Yes, Add, Next, next, Add, Yes; etc 9) hit file, save in order to Save the DAP to the Access file that you're working with. 10) it will also prompt you where to save the plain HTML file. 11) save it to a UNC path or to your C drive or something 12) right click in Windows and go 'Send-To (mail-recipient)'
rinse and repeat.
-Aaron
> Aaron -- can you tell me how I was trying to do in access? You appear > to actually know what you're doing, and if it's so easy and you're so [quoted text clipped - 81 lines] > > > lists via ODBC. But when it comes to the antispreadsheet ranting, he's > > > just angry because he's never been able to figure out how to use them. atomiks - 22 Jun 2006 19:45 GMT Yes, each list contains the UID and UNAME fields, in addition to a few others but they aren't important for this process. Column C through F on the consolidated table just note that UID/UNAME exists on the respective source (Column C being user_ss4 and Column D being EA_list_7, for example). It can really be anything. What I've been doing is pasting each of the lists, leaving a space or two between each list. The first list of UID/UNAMEs I'd put "user_ss4" in column C, then for the next list i'd put "EA_list_7" in column D, etc...just so if I eventually pulled them all together, there wouldn't be more than one item in each column. I'll try out your suggestions now, just thought I'd note this.
Otherwise, if Access would work better, I have no aversion to using it at all.
> So each list contains just user ID (UID) and name (UNAME) fields? And > the consolidated table's col C through F would also contain the UID? [quoted text clipped - 28 lines] > table with UID and UNAME as the row variables and LID as the column > variable with operation COUNT.
|
|
|