Veritas AI


10 BitGrit Competitions to Check Out As a High School Student

Data science competitions are an excellent way for high school students to develop valuable skills, gain hands-on experience, and build impressive portfolios that stand out in college applications. Competitions like those offered on BitGrit allow students to tackle real-world problems, collaborate with other participants, and apply concepts learned in class to practical scenarios. By participating in these challenges, students not only enhance their coding and analytical skills but also get a taste of what it's like to work on complex projects in fields such as AI, finance, and environmental science. Here’s a deeper look into ten BitGrit competitions that can provide both an educational experience and the chance to win prizes.

1. NASA Breath Diagnostics Challenge

The NASA Breath Diagnostics Challenge pushes participants to innovate in the field of medical diagnostics by enhancing NASA's electronic nose (E-Nose) technology. This competition is particularly suited for students interested in biomedical engineering, AI, and healthcare, offering them the chance to work with breath analysis data that could lead to groundbreaking advancements in disease detection. By participating, students get to work on complex data involving chemical signatures and contribute to the future of non-invasive medical diagnostics.

Key Details:

  • Dataset: The dataset comprises 63 text files, one per patient. Each file contains the Patient ID, the COVID-19 diagnosis (POSITIVE or NEGATIVE), and readings from the 64 sensors (D1 to D64) of an E-Nose device, with timestamps in Min format. Data collection was a 14-minute process that sampled ambient air and breath and included calibration phases. The dataset is split into a training set of 45 patients and a test set of 18 patients.

  • Required Skills: Understanding chemical data and familiarity with machine learning techniques for classification problems.

  • Cost: Free to participate.

  • Prizes: $55,000, making it a highly lucrative challenge for young innovators.

  • Competition Dates: July - September (tentatively, based on previous years).

2. Bird Species Classification Challenge

The Bird Species Classification Challenge engages participants in building machine learning models that classify bird species based on various attributes, including physical characteristics and geographic location. It’s an ideal competition for those passionate about ecology, conservation, and AI, as it combines these fields into a single, rewarding task. Competitors learn to handle image and categorical data, a skill highly relevant for environmental data analysis and conservation efforts.

Key Details:

  • Dataset: The competition features two datasets: train and test, both containing bird data for locations 1 to 3. The training_set and training_target datasets can be merged using the 'id' column. Your task is to develop an algorithm to predict the "Species" in training_target.csv.

  • Required Skills: Image recognition skills and understanding of species differentiation are beneficial.

  • Cost: Free to participate.

  • Prizes: The primary focus is on the educational experience rather than financial rewards.

  • Competition Dates: Starts in May (tentatively, based on previous years) and is ongoing.
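
To make the merge step concrete, here is a minimal sketch of joining the feature rows to their "Species" labels on the 'id' column, using only Python's standard library. The sample rows and the location and bill_length columns are made up for illustration; only 'id' and 'Species' come from the competition description.

```python
import csv
import io

# Hypothetical stand-ins for the two training files. Only 'id' and 'Species'
# are named in the competition description; the other columns are invented.
TRAINING_SET = """id,location,bill_length
1,1,39.1
2,2,46.5
3,3,50.0
"""
TRAINING_TARGET = """id,Species
1,A
2,B
3,C
"""


def merge_on_id(features_csv, target_csv):
    """Return one dict per bird, combining its features with its Species label."""
    species_by_id = {
        row["id"]: row["Species"]
        for row in csv.DictReader(io.StringIO(target_csv))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(features_csv)):
        # Skip feature rows that have no label instead of raising an error.
        if row["id"] in species_by_id:
            merged.append({**row, "Species": species_by_id[row["id"]]})
    return merged


rows = merge_on_id(TRAINING_SET, TRAINING_TARGET)
print(rows[0])
```

With the real files you would read from disk instead of from strings, and a data-frame library would likely be more convenient, but the join logic stays the same.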

3. Weather Forecast Challenge

Participants in the Weather Forecast Challenge develop models to predict weather patterns that can be used to address food security challenges globally. This competition is perfect for students with a keen interest in environmental science, statistics, and machine learning. The datasets often include historical weather data, which participants use to train models that can make accurate forecasts, a critical tool in agriculture and disaster management.

Key Details:

  • Dataset: The competition includes two datasets: train and test, both containing weather data for regions A through E. These regions and the target region for weather prediction are neighboring, but their exact locations are not provided. You can join the datasets with the solution_format.csv using the 'date' column. Your objective is to develop an algorithm to predict the "label" in solution_format.csv, noting that all values in this file are dummy values.

  • Required Skills: A good understanding of time-series analysis and predictive modeling is beneficial.

  • Cost: Free to participate.

  • Prizes: Recognition and learning-focused.

  • Competition Dates: Starts in May (tentatively, based on previous years) and is ongoing. 
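
Since every value in solution_format.csv is a dummy, a submission amounts to replacing the "label" column with your own predictions, matched by 'date'. The sketch below shows one way that could look. The date format, the sample rows, and the prediction values are assumptions for illustration, not details from the competition.

```python
import csv
import io

# Hypothetical contents of solution_format.csv. The description only says it
# has a 'date' column and a 'label' column filled with dummy values.
SOLUTION_FORMAT = """date,label
2020-01-01,0
2020-01-02,0
"""


def fill_labels(solution_csv, predictions):
    """Replace the dummy labels with predictions, keeping the original row order."""
    out = io.StringIO()
    reader = csv.DictReader(io.StringIO(solution_csv))
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # A missing date raises KeyError, so gaps are caught before submitting.
        row["label"] = predictions[row["date"]]
        writer.writerow(row)
    return out.getvalue()


# Made-up predictions keyed by date, standing in for a model's output.
submission = fill_labels(SOLUTION_FORMAT, {"2020-01-01": 1, "2020-01-02": 2})
print(submission)
```

Keeping the row order of solution_format.csv and failing loudly on a missing date are deliberate choices here: both make it harder to upload a file that silently misaligns predictions.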

4. SoftBank Forex Algorithm Challenge

The SoftBank Forex Algorithm Challenge involves creating algorithms for foreign exchange risk management and trading. It’s an excellent fit for students interested in finance, economics, and data science. The challenge allows participants to explore algorithmic trading, a high-demand skill in the financial sector. Competitors need to understand market dynamics and the stochastic nature of forex data, making it an engaging task for those looking to delve into quantitative finance.

Key Details:

  • Dataset: In this competition, you are tasked with predicting the end-of-month exchange rate using a combination of market and economic news data. The dataset includes a training set, a test set, and folders containing text features. Key variables provided are an encrypted date identifier (id), the number of days to the target date (span), and the exchange rate on the target date relative to the current rate (target). 

  • Required Skills: Knowledge of financial markets and algorithm design is essential.

  • Cost: Free to participate.

  • Prizes: $10,000.

  • Competition Dates: To be announced for 2024-25.

5. Viral Tweets Prediction Challenge

In the Viral Tweets Prediction Challenge, participants create models to predict which tweets are likely to go viral. This competition is particularly engaging for those interested in social media analytics, marketing, and AI. Competitors work with data that includes tweet text, metadata, and engagement metrics, using natural language processing (NLP) techniques to identify patterns that drive viral content.

Key Details:

  • Dataset: To build your machine learning model, you'll use the users.csv dataset, which provides key information about Twitter accounts. This dataset includes details such as the account ID, number of likes, followers, and tweets, whether the account has a location or URL, its verification status, and creation date. These features will be crucial for analyzing and modeling Twitter account behaviors.

  • Required Skills: Skills in NLP and understanding of social media dynamics are advantageous.

  • Cost: Free to participate.

  • Prizes: $3,000.

  • Competition Dates: To be announced for 2024-25.

6. Shipping Optimization Challenge

The Shipping Optimization Challenge revolves around optimizing logistics processes, specifically forecasting delivery times and shipment quantities. This competition is ideal for students interested in supply chain management, logistics, or operations research. It offers an introduction to real-world applications of predictive modeling and optimization algorithms. Participants gain valuable experience working with datasets that include delivery routes, time logs, and shipment volumes.

Key Details:

  • Dataset: In this competition segment, you'll forecast the Shipping Time required to process each shipment, measured in days with up to 5 decimal places (e.g., "5.3" or "5.45673"). Use the train_2_pr.csv file, which includes historical shipment data and known Shipping Times, to train your model. The goal is to predict Shipping Times for each Shipment ID in the test_2.csv file. The train_2_pr.csv dataset covers shipments from February 14, 2019, to June 13, 2020.

  • Required Skills: Familiarity with optimization algorithms and linear programming can be beneficial.

  • Cost: Free to participate.

  • Prizes: $7,500.

  • Competition Dates: To be announced for 2024-25.
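
The "up to 5 decimal places" rule is easy to get wrong when writing a submission file. Here is a small sketch of a formatter for it, paired with a deliberately simple mean baseline as a stand-in for a real model. The historical values are made up, and the baseline is not the competition's suggested approach.

```python
def format_shipping_time(days):
    """Format a predicted Shipping Time with at most 5 decimal places."""
    text = f"{days:.5f}"  # round to 5 decimals, e.g. "5.30000"
    return text.rstrip("0").rstrip(".")  # drop trailing zeros and a bare dot


def mean_baseline(known_times):
    """Predict the average historical Shipping Time for every shipment."""
    return sum(known_times) / len(known_times)


# Made-up historical Shipping Times in days, not taken from train_2_pr.csv.
history = [5.3, 4.75, 6.123456]
prediction = mean_baseline(history)
print(format_shipping_time(prediction))
```

A baseline like this gives you a score to beat before you invest time in features such as shipment dates or routes.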

7. Generative AI Competition

Focusing on the rapidly evolving field of generative AI, this Generative AI Competition challenges participants to detect AI-generated photos. This is crucial in today’s landscape, where distinguishing real from synthetic content has significant ethical and security implications. The competition is excellent for students who wish to delve into deep learning and generative adversarial networks (GANs).

Key Details:

  • Dataset: The goal of this competition is to develop a model that can distinguish between real and AI-generated fake photos. You will use train.csv to train your model, which includes labeled data where 0 represents 'real' and 1 represents 'fake'. Evaluate your model's performance on test.csv, which contains unseen data for prediction. Ensure your submission follows the format provided in solution_format for accurate scoring.

  • Required Skills: Experience with deep learning frameworks and image processing is helpful.

  • Cost: Free to participate.

  • Prizes: $3,000.

  • Competition Dates: April (tentatively, based on previous years).

8. Cryptocurrency Price Prediction Challenge

In the Cryptocurrency Price Prediction Challenge, participants predict cryptocurrency price movements, a task that combines financial analysis with machine learning. It’s an excellent way for students to explore the intersection of data science and finance, working with highly volatile and complex datasets. Participants can test their models on real-world market data, learning valuable lessons about market dynamics and predictive accuracy.

Key Details:

  • Dataset: The goal of this competition is to predict whether cryptocurrency prices will rise (1) or fall (0) two weeks from the prediction date; this outcome is recorded in the "Target" column. The data includes several types of columns: ID, a unique identifier for each data point; Target, the price-change label; feature_x_y, market data variables; TR_x_EventInd, indicators for events that may affect prices; and index_1 through index_3, search trends related to cryptocurrencies.

  • Required Skills: Understanding of financial modeling and time-series forecasting.

  • Cost: Free to participate.

  • Prizes: $3,000.

  • Competition Dates: March (tentatively, based on previous years).
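
Because the column names follow patterns, a useful first step is to sort them into families before any modeling. The sketch below does this with regular expressions. Treating x and y as integers is an assumption, and the header row is hypothetical, so check the real file before relying on these patterns.

```python
import re

# Patterns for the column families named in the dataset description.
# Treating x and y as integers is an assumption about the real header.
COLUMN_GROUPS = {
    "market": re.compile(r"feature_\d+_\d+"),
    "event": re.compile(r"TR_\d+_EventInd"),
    "search_trend": re.compile(r"index_[1-3]"),
}


def group_columns(header):
    """Map each column family to the columns in `header` that belong to it."""
    groups = {name: [] for name in COLUMN_GROUPS}
    for column in header:
        for name, pattern in COLUMN_GROUPS.items():
            if pattern.fullmatch(column):
                groups[name].append(column)
    return groups


# Hypothetical header row for illustration.
header = ["ID", "Target", "feature_1_1", "feature_1_2",
          "TR_1_EventInd", "index_1", "index_3"]
groups = group_columns(header)
print(groups)
```

Grouping columns this way makes it straightforward to try a model on market data alone and then see whether the event indicators or search trends add anything.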

9. Dog Breed Prediction Competition

This Dog Breed Prediction Competition involves predicting dog breeds from images, making it an accessible introduction to image classification and computer vision. It’s ideal for students new to AI, providing a practical project that combines basic image processing techniques with the excitement of working with a fun dataset.

Key Details:

  • Dataset: In this competition, you'll work with data on individual canines, where pet_id serves as the primary identifier. The dataset Dog_Breed_trainingdata.csv includes additional labels like breed, color, sex, and age. Each pet_id is linked to an image file with the same ID (e.g., pet_id 100100 corresponds to 100100.jpg). Additionally, breed_id denotes specific breeds, such as breed_id 1 for Affenpinscher.

  • Required Skills: Basic knowledge of convolutional neural networks (CNNs) and image classification.

  • Cost: Free to participate.

  • Prizes: $275.

  • Competition Dates: To be announced for 2024-25.
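
The link between pet_id and image file name can be turned into a small lookup helper. In this sketch, the "images" folder name is an assumption and the CSV rows are invented; only the pet_id-to-file naming rule and the pet_id and breed_id column names come from the description.

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical excerpt of Dog_Breed_trainingdata.csv. Both rows are invented
# and only two of the file's columns are shown.
SAMPLE = """pet_id,breed_id
100100,1
100101,2
"""


def image_path_for(pet_id, image_dir="images"):
    """Build the path of the photo that shares its name with a pet_id."""
    return PurePosixPath(image_dir) / f"{pet_id}.jpg"


def labelled_images(csv_text, image_dir="images"):
    """Pair each image path with its integer breed_id label."""
    return [
        (image_path_for(row["pet_id"], image_dir), int(row["breed_id"]))
        for row in csv.DictReader(io.StringIO(csv_text))
    ]


pairs = labelled_images(SAMPLE)
print(pairs[0])
```

A list of (path, label) pairs like this is the usual starting point for feeding images into an image classification pipeline.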

10. Precision Farming Challenge

Precision farming uses technology to increase agricultural efficiency, and this competition challenges participants to detect field boundaries using satellite imagery. It’s perfect for students interested in agriculture, sustainability, and remote sensing, offering the chance to work with geospatial data and learn about applications of AI in environmental management.

Key Details:

  • Dataset: Your task is to detect the contour boundaries of cultivated fields from satellite images. The solution should process the satellite image to produce a monochromatic output of the same resolution, displaying only the contour outlines along with the geographic metadata. Utilize the most effective spectral band(s), pre-processing techniques, and contour detection strategies to achieve accurate results. The solution should be applicable to both the provided sample data and new datasets.

  • Required Skills: Geospatial analysis and familiarity with image processing.

  • Cost: Free to participate.

  • Prizes: $275.

  • Competition Dates: To be announced for 2024-25.
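
To give a feel for what "a monochromatic output of the same resolution" means, here is a toy sketch that marks sharp brightness changes in a single-band grid. It uses a simple neighbor-difference threshold, which is a stand-in technique chosen for illustration and not the competition's method. The threshold value and the sample grid are made up, and a real solution would work on actual satellite bands and carry the geographic metadata along.

```python
def detect_contours(image, threshold=50):
    """Return a 0/255 mask, same size as `image`, marking sharp brightness changes."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Compare each pixel with its right and lower neighbors, if any.
            right = image[r][c + 1] if c + 1 < width else image[r][c]
            below = image[r + 1][c] if r + 1 < height else image[r][c]
            gradient = abs(right - image[r][c]) + abs(below - image[r][c])
            if gradient >= threshold:
                mask[r][c] = 255
    return mask


# Toy single-band "image": a bright 2x2 field on a dark background.
band = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
mask = detect_contours(band)
for row in mask:
    print(row)
```

The output has the same number of rows and columns as the input and contains only two values, which is the shape of result the task asks for, even though real field boundaries need far more careful pre-processing.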

These competitions not only offer an engaging way to apply data science skills but also provide tangible outputs that students can showcase in their portfolios. Whether interested in finance, environmental science, healthcare, or AI ethics, there’s something for everyone on BitGrit. Head to the BitGrit competition page to explore these opportunities further and start competing today!


If you’re looking to build your AI & data science skills and work on a unique, independent project/research paper in the field of AI & ML, consider applying to Veritas AI!

Veritas AI was founded by Harvard graduate students. Through its programs, you get a chance to work 1-on-1 with mentors from universities like Harvard, Stanford, MIT, and more to create unique, personalized projects. In the past year, over 1,000 students learned AI & ML with us. You can apply here!


