DS@GT Applied Research Competitions
The Data Science @ Georgia Tech (DS@GT) Applied Research Competitions (ARC) is a student-run research group focused on machine learning, information retrieval, and data-driven scientific modeling through participation in competitive research challenges. The group organizes participation in the Conference and Labs of the Evaluation Forum (CLEF) but also leverages platforms like Kaggle and PACE for training and internal skill development. Our code is available on GitHub.
We recently wrapped up our Spring 2025 semester with 22 accepted working notes papers and 44 authors at CLEF 2025. We are preparing for the Fall 2025 Interest Group and the Spring 2026 competition season. In addition to CLEF 2026, we plan to participate in TREC 2025, MediaEval 2025, and NTCIR 19.
Membership
ARC operates as a project group within the broader Data Science @ Georgia Tech student organization. Members are expected to be members of DS@GT and adhere to its general guidelines.
- Eligibility: Open to all Georgia Tech students (undergraduate, graduate, and PhD, including OMSCS/OMSA) and to alumni with student status (e.g., enrolled in a for-credit seminar). Both on-campus and online students participate actively.
- Requirements: Members must be part of the parent DS@GT organization (including paying dues). Active participation, especially in the Fall, is crucial for Spring team placement.
- Minimum Technical Expectations: Proficiency in Python (SciPy stack: NumPy, Pandas, Matplotlib) and Git version control. Familiarity with ML concepts is highly beneficial. Prior experience with a non-trivial ML/IR project, or software engineering experience, is expected. Completion of or enrollment in a project-heavy course (e.g., ML, DL) is recommended.
Group Structure and Schedule
Recordings of our Fall 2024 Interest Group can be found below and provide an overview of the group’s structure and expectations.
The group operates on a two-semester academic cycle.
- Fall Semester: Interest Group & Preparation
  - Focus: Introduction to competitive data science (Kaggle) and research competitions (CLEF).
  - Activities: Weekly meetings, EDA assignments, paper discussions, foundational skills training (e.g., PACE usage).
  - Outcome: Formation of motivated and prepared teams for the Spring semester.
  - Time Commitment: ~2-3 hours/week (equivalent to a 1-unit seminar).
- Spring Semester: Competition Execution & Publication
  - Focus: Deep dive into specific CLEF tasks within dedicated teams.
  - Activities: Team-based research and development, model building, experimentation, result submission, writing and submitting working notes papers.
  - Outcome: Competition submissions, published papers, presentation of work.
  - Time Commitment: ~100-150+ hours total (equivalent to a 2-3 unit course), varies by role and project intensity.