This is a query that returns the students registered for a course, and identifies those who are marked as "arrived" based on their accommodation (room at the retreat). However, some students may have more than one accommodation (because they are going to move from one room to another during their stay). In my left join query, the query correctly returns a row if there is no match on the right side, but it returns multiple rows if there is more than one match on the right side. DISTINCT and GROUP BY can't be used on the right side. What I need is for the query to return one row for a match on the left side, regardless of if or how many matches there are on the right side.
Thanks for any help.
SELECT A.student_id, A.invoice_date, A.invoice_set, A.last_name, A.first_name, A.addon_last_name, A.addon_first_name, A.gender, B.arrived, B.invoice_id AS arrived_invoice_id, A.invoice_id FROM (SELECT * FROM students.student_id, invoice_date, invoice_set, invoice_id, last_name, first_name, addon_last_name, addon_first_name, invoice.gender FROM invoice, students WHERE product_id = '$course_id' AND product_type = 'COURSE' AND commit_invoice='1' AND students.student_id=invoice.student_id)A LEFT JOIN invoice AS B ON A.student_id=B.student_id AND A.invoice_set=B.invoice_set AND ((A.addon_last_name=B.addon_last_name AND A.addon_first_name=B.addon_first_name) OR (A.addon_last_name IS NULL AND B.addon_last_name IS NULL)) AND B.product_type='LODGING' AND B.commit_invoice='1' ORDER BY A.invoice_date DESC
LEFT JOIN returning duplicate rows
Moderator: General Moderators
Re: LEFT JOIN returning duplicate rows
It looks like you're trying to get the arrival information for the students? So there should only be one logical row in B matching that - the others would be transfers and not arrivals. If so, you should only be joining in rows from B that represent an arrival.
Unless you have something in B that indicates arrival vs transfer (I didn't see anything in that query), and you can't add that (I think that'd be a good idea), then
1. Write a subquery to get the earliest arrival time from the invoice table, assuming the arrival time is the best way to figure which is the arrival and which is a transfer.
2. Join that subquery against A and B so that B only matches on the arrival time. This then returns to you only one row from B and that row is for the arrival.
Unless you have something in B that indicates arrival vs transfer (I didn't see anything in that query), and you can't add that (I think that'd be a good idea), then
1. Write a subquery to get the earliest arrival time from the invoice table, assuming the arrival time is the best way to figure which is the arrival and which is a transfer.
2. Join that subquery against A and B so that B only matches on the arrival time. This then returns to you only one row from B and that row is for the arrival.
Re: LEFT JOIN returning duplicate rows
This question was answered on the SitePoint forum. Requinix, thanks much for your contribution.
If anyone is curious, this is the revised query that solves the issue:
$sql=$dbh->prepare("SELECT A.student_id
, A.invoice_date
, A.invoice_set
, A.last_name
, A.first_name
, A.addon_last_name
, A.addon_first_name
, A.gender
, B.arrived
, B.invoice_id AS arrived_invoice_id
, A.invoice_id
FROM ( SELECT students.student_id
, invoice_date
, invoice_set
, invoice_id
, last_name
, first_name
, addon_last_name
, addon_first_name
, invoice.gender
FROM invoice
, students
WHERE product_id = '$course_id'
AND product_type = 'COURSE'
AND commit_invoice='1'
AND students.student_id=invoice.student_id ) A
LEFT OUTER
JOIN ( SELECT student_id
, invoice_set
, addon_last_name
, addon_first_name
, MAX(invoice_id) AS latest
FROM invoice
WHERE product_type='LODGING'
AND commit_invoice='1'
GROUP
BY student_id
, invoice_set
, addon_last_name
, addon_first_name ) AS M
ON M.student_id = A.student_id
AND M.invoice_set = A.invoice_set
AND (
( M.addon_last_name = A.addon_last_name
AND M.addon_first_name = A.addon_first_name
)
OR ( M.addon_last_name IS NULL
AND A.addon_last_name IS NULL
)
)
LEFT OUTER
JOIN invoice AS B
ON B.invoice_id = M.latest
ORDER
BY A.invoice_date DESC")
If anyone is curious, this is the revised query that solves the issue:
$sql=$dbh->prepare("SELECT A.student_id
, A.invoice_date
, A.invoice_set
, A.last_name
, A.first_name
, A.addon_last_name
, A.addon_first_name
, A.gender
, B.arrived
, B.invoice_id AS arrived_invoice_id
, A.invoice_id
FROM ( SELECT students.student_id
, invoice_date
, invoice_set
, invoice_id
, last_name
, first_name
, addon_last_name
, addon_first_name
, invoice.gender
FROM invoice
, students
WHERE product_id = '$course_id'
AND product_type = 'COURSE'
AND commit_invoice='1'
AND students.student_id=invoice.student_id ) A
LEFT OUTER
JOIN ( SELECT student_id
, invoice_set
, addon_last_name
, addon_first_name
, MAX(invoice_id) AS latest
FROM invoice
WHERE product_type='LODGING'
AND commit_invoice='1'
GROUP
BY student_id
, invoice_set
, addon_last_name
, addon_first_name ) AS M
ON M.student_id = A.student_id
AND M.invoice_set = A.invoice_set
AND (
( M.addon_last_name = A.addon_last_name
AND M.addon_first_name = A.addon_first_name
)
OR ( M.addon_last_name IS NULL
AND A.addon_last_name IS NULL
)
)
LEFT OUTER
JOIN invoice AS B
ON B.invoice_id = M.latest
ORDER
BY A.invoice_date DESC")