View Full Version : help with querys & function
i've got alot of tables tables... the main table holds all the user names & passwords. each user also has his/her own table (named like 'user_username_records').
here's what i'm trying to do: read all the user names from the main table, then loop through all the user's sub tables and get the total number of rows for each one. untimately i want to display the top 5 users with the most rows.
i've started the function, but i'm not sure what to do next.
function most_records()
{
$conn = db_connect();
$result = $conn->query("select * from users");
$num_users = $result->num_rows; // total number of users
for ($i=0; $i < $num_users; $i++)
{
$row = $result->fetch_assoc();
$username = htmlspecialchars(strtolower($row["user"]));
$user_db = "user_".$username."_records";
$query = $conn->query("select * from $user_db");
$num_records = $query->num_rows; // total number of records
echo "$username ($num_records) <br>";
}
}
i'm not sure how to get only the top 5 users.
HalfaBee
8-20-06, 11:30 PM
maaybe like this
function most_records()
{
$conn = db_connect();
$result = $conn->query("select * from users");
$num_users = $result->num_rows; // total number of users
for ($i=0; $i < $num_users; $i++)
{
$row = $result->fetch_assoc();
$username = htmlspecialchars(strtolower($row["user"]));
$user_db = "user_".$username."_records";
$query = $conn->query("select * from $user_db");
$num_records = $query->num_rows; // total number of records
echo "$username ($num_records) <br>";
$users[$row['user']]=$num_records;
}
arsort( $users );
}
mitchind
8-20-06, 11:48 PM
Ideally you should redesign your database. You've managed to bastardize what should have been a simple relational database with two tables. The second table should just have a non-unique indexed field called username which is tied to your main username table.
Then it's a simple query to get to the data you want. And minimal code.
Doesn't each username table have the same fields?
HalfaBee
8-21-06, 12:07 AM
Ideally you should redesign your database. You've managed to bastardize what should have been a simple relational database with two tables. The second table should just have a non-unique indexed field called username which is tied to your main username table.
Then it's a simple query to get to the data you want. And minimal code.
Doesn't each username table have the same fields?
I mangaed to bastardize my DB the same way, and then it had 700 tables with 500-1000 rows each, each table was updated daily with new data.
It ran a lot quicker that way when extracting the data. ;)
mitchind
8-21-06, 12:09 AM
I guess my old-school mentality is showing through?
HalfaBee
8-21-06, 12:12 AM
I guess my old-school mentality is showing through?
No, with 700,000 rows a simple query took forever, even on a dedicated server.
mitchind
8-21-06, 12:23 AM
Point taken.
There's always exceptions - and I know I've broken the rules myself occasionally.
well, i thought that i had this planned out ok... but maybe not. the big picture:
table 'records' has information about albums. each row is a different album and each row has a unique id.
table 'user_username_records' is for a specific user. each row has only a couple columns, record id and format.
it's natural that there will be repeated data since several people may have the same albums, but some may have it on CD, some LP, some MP3, etc.
does this make logical sense?
HalfaBee
8-21-06, 08:37 AM
If you just have a few records ( < 100 ) and a few users ( <500 ) one table for all users will be fine.
My case was extreme and had daily updates.
Mitchmind's suggestion was probably the way the data should be organised.
It is all a matter of maintainability, speed and access speed.
I think the code I posted will do what you need.
It at least sorts the data and you only need to pick the top 5 off the array.
i'm trying to plan ahead... the main 'records' table could end up having thousands of entries. and, in theory there could be thousands of user tables. that is why i went with this method.
ok, so i've tried to make use of your edits... here's what i've got now:
<?
$conn = db_connect();
$result = $conn->query("select * from users where confirm is not null");
$num_users = $result->num_rows; // total number of users
for ($i=0; $i < $num_users; $i++)
{
$row = $result->fetch_assoc();
$username = htmlspecialchars(strtolower($row["user"]));
$user_db = "user_".$username."_records";
$query = $conn->query("select * from $user_db");
$num_records = $query->num_rows; // total number of records
$users[$row['user']]=$num_records;
}
arsort( $users );
foreach ($users as $num_records)
{
echo "<a href=#>$username</a> // ($num_records) <br>";
}
?>
with 6 test users (each one has a different number of albums in their DB) this prints:
user6 // (8)
user6 // (4)
user6 // (2)
user6 // (2)
user6 // (1)
user6 // (0)
so, it's getting them in order... but it's not keeping the username with the correct number. i'm also stuck on how to print only the top 5. i figure there needs to be another loop in there somewhere, but i'm unsure of how to implement it.
HalfaBee
8-21-06, 05:36 PM
foreach ($users as $username => $num_records)
{
echo "<a href=#>$username</a> // ($num_records) <br>";
}
thanks so much for your help! here's the final code that worked for my needs:
$conn = db_connect();
$result = $conn->query("select * from users where confirm is not null");
$num_users = $result->num_rows; // total number of users
for ($i=0; $i < $num_users; $i++)
{
$row = $result->fetch_assoc();
$username = htmlspecialchars(strtolower($row["user"]));
$user_db = "user_".$username."_records";
$query = $conn->query("select * from $user_db");
$num_records = $query->num_rows; // total number of records
$users[$row['user']]=$num_records;
}
arsort( $users );
foreach ($users as $username => $num_records)
{
if($n<5)
{
echo "<a href=#>$username</a> // ($num_records) <br>";
$n++;
}
}
Ideally you should redesign your database. You've managed to bastardize what should have been a simple relational database with two tables. The second table should just have a non-unique indexed field called username which is tied to your main username table.
Then it's a simple query to get to the data you want. And minimal code.
Doesn't each username table have the same fields?
i'm trying to understand database structure and design better, but i have a ways to go. each user table will have the same columns: id, record_id, format & notes. but, eac user will have different info for each row.
so, by your theory could i not just add a non-unique indexed column to each user's table? would a simple JOIN work at that point?
mitchind
8-25-06, 12:43 PM
I was simply saying you should only need 1 user table - just add a column 'userid' to the table. Then, yes, a simple join will work.
I was simply saying you should only need 1 user table - just add a column 'userid' to the table. Then, yes, a simple join will work.
i do have only one user table. one table named 'users' that houses usernames & passwords for each user.
for what i'm trying to do, each user will need his/her own table to hold their respected data named 'user_username_records' (which holds ID numbers from another table called 'records').
so, either i need to do something different for the nameing on the 'user_username_records' tables, or i'm missing something altogether.
mitchind
8-25-06, 06:05 PM
I guess I should use your terminology then .. one user_record table - linked to your user table by user_id field.
Where did records table come from and what is in it now?
You only need two tables - user, user_records
Common field will be username or user_id
the big thing i'm trying to do is plan ahead for what might be a popular site. i want this site to navigate so that you browse to a record (or search for it) and add it to 'my records' (similar to the way you add friends on myspace.com).
someone on another forum suggested having one table for all the users' records like you did, but how would you organize this? it seems like that one table would be massive in just a couple months.
this site is for record collectors... so assuming i have only 100 users (being very modest), each with over 1,000 records/albums wouldn't that be too big for one table if the site ever gets more poular?
jmucchiello
8-28-06, 05:40 PM
If you are really worried about how big the site would get partition the albums table by album. (Please call the table "albums" as "record" has a database meaning that makes communication confusing.)
So:
users
pk_user integer not null identity
name varchar(50)
pwd whatever...
artist
pk_artist integer not null identity
name varchar(200)
other artist info
albums
pk_album integer not null identity
pk_artist integer
title varchar(200)
other album stuff....
user_albums
pk_user_album integer not null identity
pk_user integer not null
pk_album integer not null
format char(1)
notes varchar(200)
If you really start getting large you change it to
user_albums_1
user_albums_2
user_albums_3
etc.
and in your querries you have something like
$user_table = get_table_name($User);
"SELECT album.* from albums, $user_table ".
"WHERE pk_user = $User['PK_USER'] and albums.pk_album = $user_table.pk_album ".
"ORDER BY album.title"
The get_table_name function at first always returns "USER_ALBUMS". Later you could cut it in half and have all user_names that start with A-M be user_albums_1 and other names be user_albums_2. Etc.
The problem though is when someone wants to know what users own a specific album. Then you need to union all those tables together again.
Dividing the albums up by user is a problem because a lot of tables with a couple thousand records is not as efficient (space and timewise) as fewer tables with many records. Especially if you need to do searches across users.
Of course you also need to index all of this and you need a cron job that will optimize the database so that your indexes remain speedy.
jesus... this is way out of my league. have any suggestive reading on the subject?
i wasn't worried about the main records (or albums) table. i have it set up like:
id int(10) auto_increment Primary
artist text
album text
label text
date varchar(4)
tracks text
image varchar(14)
confirm char(1)
date_added varchar(15)
addedby_id varchar(10)
...instead of dividing it up by artist & albums. does that seem correct?
jmucchiello
8-28-06, 06:20 PM
You are going to be repeating that artist and album text a lot in the database. Databases save space only if you eliminate redundant info. If every user has the Beatles Sgt Pepper's Lonely Hearts Club Band, you will store SPLHCB on each of those records. Divide it up and only have to store the text (and the label and the artist and track listings) once.
Also, your track text is a huge waste of space if everyone who owns the same album is typing in the track titles. Highly redundant. These huge wastes of space are why your database might get slow if your site becomes popular. I didn't include tracks in my example because you would need to do it by format. Some CDs have tracks not on the Record or Cassette and vice versa.
Date_added should be a date field. Addedby_id should be an integer field. Release_date should be by format if you are including older albums in this db.
Of course the easy way to do this is to use the CDDB identifier and not store that album info at all but I'm guessing you aren't aware of that web resource.
HalfaBee
8-28-06, 07:23 PM
100,000 records isn't too big, but as it grows the queries will get slower and slower, it also depends on your row size and what you want to search on.
If you want to do stats of say the number of Beetle's albums, one table would be the best.
You are going to be repeating that artist and album text a lot in the database. Databases save space only if you eliminate redundant info. If every user has the Beatles Sgt Pepper's Lonely Hearts Club Band, you will store SPLHCB on each of those records. Divide it up and only have to store the text (and the label and the artist and track listings) once.
Also, your track text is a huge waste of space if everyone who owns the same album is typing in the track titles. Highly redundant. These huge wastes of space are why your database might get slow if your site becomes popular. I didn't include tracks in my example because you would need to do it by format. Some CDs have tracks not on the Record or Cassette and vice versa.
Date_added should be a date field. Addedby_id should be an integer field. Release_date should be by format if you are including older albums in this db.
Of course the easy way to do this is to use the CDDB identifier and not store that album info at all but I'm guessing you aren't aware of that web resource.
i'm aware of CDDB, but wasn't sure of how to access it via my web site. i only use it when i insert a CD into my computer and it automatically gets the info.
i still don't think everyone is fully understanding what i'm trying to do...
there is only one field where information is being repeated. here are my tables:
* 'records' - or albums if you prefer. this is the main table that holds info about each album. the idea is to have ALL albums in this table, once. then refer to each album by the 'id' number.
* 'users' - this holds username passwords for all users.
* 'user_whateverusername_records' - this table holds a specific user's albums... not all the info about an album JUST the 'record ID' (which is the repeated info), 'format' (CD, LP, etc.) and any arbitrary 'notes' about it.
for the date added field i was using time(). and the release date is just the year ('1967' for SPLHCB)
having a separate table created for every single one of your users is poor configuration. It may make it easier to wrap your head around it, but it's bad form. If you end up with 1000 users, you end up with 1000 tables.
The way I would've done it is with a single table to hold your record information. A unique column, a column for the username, a column for record ID, and that's about it. Then another table that holds record + format + notes, and a third table that holds your user data.
user table:
userid | username | userpassword | whatever
record table:
recordid | recordname | format type | notes | whatever
user_records table:
user_recordid | userid | recordid
quick and simple.
jmucchiello
8-29-06, 05:37 PM
there is only one field where information is being repeated. here are my tables:
* 'records' - or albums if you prefer. this is the main table that holds info about each album. the idea is to have ALL albums in this table, once. then refer to each album by the 'id' number.
* 'users' - this holds username passwords for all users.
* 'user_whateverusername_records' - this table holds a specific user's albums... not all the info about an album JUST the 'record ID' (which is the repeated info), 'format' (CD, LP, etc.) and any arbitrary 'notes' about it.Isn't this what I wrote above except I have only one join table called user_albums? If you index the single table correctly, it should not degrade in performance too much. And if you do use multiple tables, you will not be able to do querries across multiple user collections.
i'm aware of CDDB, but wasn't sure of how to access it via my web site. i only use it when i insert a CD into my computer and it automatically gets the info.Try this (http://www.freedb.org/modules.php?name=Sections&sop=listarticles&secid=2) website. Of course just a cursory look over the protocol and you cannot access it from powweb because it is not on port 80. But you could use the protocol to seed your albums table.
ok, so i've set up a user_albums table like:
id | username | record_id | format | notes
back to my original question: how do i query this table to get the top five users with the most albums?
jmucchiello
8-31-06, 04:42 AM
SELECT COUNT(*), username FROM user_albums
GROUP BY username
LIMIT 1,5
or if you want to weed out duplicate albums in multiple formats (though I am unsure if mySQL supports this construct):
SELECT COUNT(DISTINCT record_id), username FROM user_albums
GROUP BY username
LIMIT 1,5
i'm getting stuck with the PHP code to display them... here's what i've got:
$result = $conn->query("select count(*), username from user_albums group by username limit 0,5");
$num_records = $result->num_rows;
while ($row = $result->fetch_assoc())
{
echo "$row['username'] // $num_records";
}
that's printing five users, each with 5 albums in no particular order... each user should have a different number of albums.
jmucchiello
8-31-06, 05:00 PM
Sorry, forgot the order by
select count(*) as num_albums, username
from user_albums
group by username
order by 1 desc limit 0,5
echo "$row['username'] has $row['num_albums'] albums";
vBulletin v3.6.0, Copyright ©2000-2009, Jelsoft Enterprises Ltd.