I have a CSV file from the Kaggle Titanic competition as follows. The record format of this file is described by the following columns: PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked. I want to analyze the data in this file and check whether passengers traveling in a group had a better survival rate. For this I assume that the value for Ticket will be the same for all passengers in a group.
I loaded the CSV in MS Access, and executed the following query to get the desired result set:
SELECT a.Ticket, a.PassengerId, a.Survived
FROM train a
WHERE 1 < (SELECT COUNT(*) FROM train b WHERE b.Ticket = a.Ticket)
ORDER BY a.Ticket
I am not being able to extract the same result set as above, without writing a loop.