Contact us at challenge@padang.co

PROBLEM STATEMENT

Economies in Southeast Asia are turning to AI to solve traffic congestion, which hinders mobility and economic growth. The first step in the push towards alleviating traffic congestion is to understand travel demand and travel patterns within the city.

 

Can we accurately forecast travel demand based on historical Grab bookings to predict areas and times with high travel demand?

SUBMISSION DEADLINE

Please submit the final repository, including documentation, by 17 June 2019, 6:00 pm (SGT).

In this challenge, participants are to build a model, trained on a historical demand dataset, that can forecast demand on a hold-out test dataset. Given all data up to time T, the model should accurately forecast demand for the intervals T+1 to T+5, where each interval is 15 minutes long.
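The forecasting setup described above can be sketched as a sliding window over one location's demand series. This is only an illustration, not a prescribed method: the function name `make_windows` and the `history` parameter are hypothetical, and the series is assumed to be ordered in time with no missing intervals.

```python
from typing import List, Tuple

HORIZON = 5  # T+1 .. T+5, each step being one 15-minute interval


def make_windows(
    series: List[float], history: int, horizon: int = HORIZON
) -> List[Tuple[List[float], List[float]]]:
    """Split one location's demand series into (past, future) training pairs.

    `series` is assumed to be ordered in time with no gaps (an assumption).
    """
    pairs = []
    # t is the index of the last observed interval, i.e. time T.
    for t in range(history - 1, len(series) - horizon):
        past = series[t - history + 1 : t + 1]    # data up to and including T
        future = series[t + 1 : t + 1 + horizon]  # T+1 .. T+5
        pairs.append((past, future))
    return pairs


demand = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
pairs = make_windows(demand, history=3)
```

Each pair holds the inputs a model may see at time T and the five values it is expected to predict.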

The given dataset contains the normalised historical demand of a city, aggregated spatiotemporally within geohashes and over 15-minute intervals. The dataset spans a two-month period. A brief description of the dataset fields is given below:

Field: Description

  • geohash6: geohash level 6. Geohash is a public domain geocoding system which encodes a geographic location into a short string of letters and digits with arbitrary precision. You are free to use any geohash library to encode/decode the geohashes into latitude and longitude or vice versa. Some examples include https://github.com/hkwi/python-geohash (for Python), https://github.com/kungfoo/geohash-java (for Java).

  • day: day, where the value indicates the sequential order and not a particular day of the month

  • timestamp: start time of 15-minute intervals, in the following format: <hour>:<minute>, where hour ranges from 0 to 23 and minute is either one of (0, 15, 30, 45)

  • demand: aggregated demand normalised to be in the range [0,1]
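As a rough illustration of how these fields might be turned into model-ready values, the sketch below converts `day` and `timestamp` into a single sequential interval index and decodes a geohash into the centre of its cell. The helper names `interval_index` and `decode_geohash` are hypothetical. The decoder is a plain-Python version of the standard geohash algorithm, written so the example has no dependencies; either of the libraries mentioned above could be used instead.

```python
from typing import Tuple

INTERVALS_PER_DAY = 24 * 4  # 96 fifteen-minute buckets per day
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def interval_index(day: int, timestamp: str) -> int:
    """Map (day, "<hour>:<minute>") to one sequential 15-minute index."""
    hour, minute = (int(part) for part in timestamp.split(":"))
    return day * INTERVALS_PER_DAY + hour * 4 + minute // 15


def decode_geohash(code: str) -> Tuple[float, float]:
    """Return (latitude, longitude) of the centre of a geohash cell."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    use_lon = True  # bits alternate between axes, starting with longitude
    for char in code:
        value = _BASE32.index(char)
        for shift in (4, 3, 2, 1, 0):  # five bits per character, high bit first
            bit = (value >> shift) & 1
            bounds = lon if use_lon else lat
            mid = (bounds[0] + bounds[1]) / 2
            if bit:
                bounds[0] = mid  # keep the upper half of the range
            else:
                bounds[1] = mid  # keep the lower half of the range
            use_lon = not use_lon
    return (lat[0] + lat[1]) / 2, (lon[0] + lon[1]) / 2


lat, lon = decode_geohash("ezs42")  # a commonly cited example geohash
```

Because the geohashes in the dataset are anonymised, decoded coordinates should be read as relative positions rather than real locations.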

You will be judged on the following criteria:

Code Quality

Code Quality, also known as Software Quality, is generally defined in two ways:
 

  • Functional quality, which reflects how well the code conforms to the functional specifications and requirements of a project.

  • Structural quality, which relates to the maintainability and robustness of the code.

Creativity in Problem-solving

Creativity speaks volumes about your ability to make sense of the given data, derive tangible results relevant to the business needs of an organization, and present your findings, all while keeping the problem statement in mind.

 


Feature Engineering

 

Feature Engineering, also referred to as pre-processing, is the process of selecting and transforming variables when creating a data model for a given problem statement. While you will be given a general dataset related to the problem statement, you need to create “features” that make the models and algorithms work as intended.

 

Note that your code should create your desired features automatically, so that they can also be generated for the hold-out test set during evaluation.
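One way to meet this requirement is to keep all feature creation in a single function that takes raw values and can be called unchanged on training and test data. The sketch below is purely illustrative: the function name `build_features` and the particular features (a cyclical time-of-day encoding, the latest demand value, and the mean over the last hour) are assumptions, not features recommended by the organisers.

```python
import math
from typing import Dict, List

INTERVALS_PER_DAY = 96  # 15-minute buckets in a day


def build_features(history: List[float], hour: int, minute: int) -> Dict[str, float]:
    """Create features for one geohash at time T.

    `history` holds that geohash's demand values, oldest first, ending at T.
    `hour` and `minute` are the parsed timestamp of T.
    """
    slot = hour * 4 + minute // 15                 # position within the day, 0..95
    angle = 2 * math.pi * slot / INTERVALS_PER_DAY
    recent = history[-4:]                          # up to the last four intervals
    return {
        "tod_sin": math.sin(angle),                # cyclical time-of-day encoding
        "tod_cos": math.cos(angle),
        "lag_1": history[-1],                      # demand at time T
        "mean_last_hour": sum(recent) / len(recent),
    }


features = build_features([0.2, 0.4, 0.6, 0.8], hour=0, minute=0)
```

Keeping the logic in one place means evaluators can run the same code path on the hold-out data without manual steps.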

Model Performance
 

Model performance measures how well the chosen model represents the data and how well it is likely to work on new data. In this challenge, we will be performing a hold-out model evaluation. You are given a training dataset, and our evaluators will have a test dataset that the model has not seen. This test dataset will be used to assess the likely future performance of the model.
 

Test dataset details:

1. Timeframe: The test dataset can start at any time after the timeframe of the training dataset. Your model can use features from up to 14 consecutive days of the test dataset, ending at timestamp T, to predict T+1 to T+5.


2. Geohash coverage: You may assume that the set of geohashes is the same in the training dataset and the test dataset. The original geohashes are anonymised, but you may assume that adjacency between geohashes is maintained.
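The timeframe constraint can be sketched as follows for a single geohash: take at most 14 days of intervals ending at T as input, and name the five intervals to predict. The function name `history_and_targets` is hypothetical, intervals are assumed to be numbered sequentially, and filling missing intervals with 0.0 is an assumption made here for illustration, not something the brief specifies.

```python
from typing import Dict, List, Tuple

INTERVALS_PER_DAY = 96                 # 15-minute buckets in a day
MAX_HISTORY = 14 * INTERVALS_PER_DAY   # at most 14 consecutive days of input
HORIZON = 5                            # predict T+1 .. T+5


def history_and_targets(
    demand_by_interval: Dict[int, float], t: int
) -> Tuple[List[float], List[int]]:
    """Return the allowed input window ending at T and the indices to predict.

    `demand_by_interval` maps a sequential interval index to demand for one
    geohash. Missing intervals are filled with 0.0 (an assumption).
    """
    start = t - MAX_HISTORY + 1
    history = [demand_by_interval.get(i, 0.0) for i in range(start, t + 1)]
    targets = list(range(t + 1, t + 1 + HORIZON))
    return history, targets


rows = {i: 0.5 for i in range(2000)}
history, targets = history_and_targets(rows, t=1999)
```

Here `history` contains 1,344 values (14 days of 96 intervals) and `targets` lists the five interval indices that follow T.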


Submissions will be evaluated by RMSE (root mean squared error) computed over all (geohash6, 15-minute bucket) pairs.
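For reference, RMSE is the square root of the mean of the squared differences between actual and predicted demand. The sketch below shows one way to compute it over (geohash6, interval) pairs. The dictionary layout and the function name `rmse` are assumptions for illustration; the evaluators' exact script is not described in this brief.

```python
import math
from typing import Dict, Tuple

Key = Tuple[str, int]  # (geohash6, sequential 15-minute interval index)


def rmse(actual: Dict[Key, float], predicted: Dict[Key, float]) -> float:
    """Root mean squared error over every (geohash6, interval) pair in `actual`."""
    if not actual:
        raise ValueError("no observations to score")
    squared_errors = [
        (predicted[key] - value) ** 2  # raises KeyError if a prediction is missing
        for key, value in actual.items()
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


actual = {("abc123", 1): 0.5, ("abc123", 2): 0.0}
predicted = {("abc123", 1): 0.5, ("abc123", 2): 1.0}
score = rmse(actual, predicted)
```

In this toy case one prediction is exact and the other is off by 1.0, so the score is the square root of 0.5.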

QUALIFICATION CRITERIA

  • Submit the correct link to your repository

  • Make sure your repository includes the complete codebase (all commits pushed, documentation complete, etc.)

  • Solve only one of the challenges mentioned on the website

  • Do not plagiarise code. Plagiarism will be grounds for instant disqualification

  • The link to your repository must be publicly accessible from the time of submission

SUBMISSION GUIDELINES

You can submit your code (either as a codebase or a Jupyter notebook) by uploading it to a public GitHub or similar repository. Instructions for submitting the repository link will be sent to you by email once you accept the challenge on https://www.aiforsea.com/