Tuesday, May 28, 2024

What does it take to be number one in the world on Kaggle?

An Interview with Guanshuo Xu: Data Scientist, Kaggle Competitions Grandmaster (Rank 1), and PhD in Electrical Engineering

In this series of interviews, data scientists and Kaggle Grandmasters at H2O.ai share their journeys, their inspirations, and the accomplishments that brought them to where they are today. The interviews are intended to motivate and encourage anyone who wants to understand what it takes to become a Kaggle Grandmaster.

In this article, I share my conversation with Guanshuo Xu. He is a Kaggle Competitions Grandmaster and a data scientist at H2O.ai. He has a PhD in electrical and electronic engineering from the New Jersey Institute of Technology, where he studied image forensics and steganalysis using machine learning.

(*Translation Note 1) New Jersey Institute of Technology is a leading public technical university in the United States, located in New Jersey. Walter Schirra, the only astronaut to have flown in the Mercury, Gemini, and Apollo programs, is an alumnus of the university.
(*Translation Note 2) (Digital) forensics is the investigative work of recovering and analyzing information recorded on digital devices in computer crime cases. Steganalysis is the study of detecting data hidden in digital media such as images and audio. Both forensics and steganalysis are used in cybercrime investigations.
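
To make the idea of hidden data concrete, here is a minimal Python sketch of least-significant-bit (LSB) embedding, one simple way to hide bytes inside pixel values. It is an illustration only and is not taken from Guanshuo's research; the function names `embed_lsb` and `extract_lsb` and the pixel values are made up for this example. Steganalysis is the reverse problem: deciding, without knowing the message, whether an image has been altered in this way.

```python
def embed_lsb(pixels, message):
    """Hide the bytes of `message` in the least significant bits of `pixels`."""
    # Flatten the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message is too long for this cover image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite only the lowest bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` back from the least significant bits of `pixels`."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[i * 8 + bit_index] & 1)
        out.append(byte)
    return bytes(out)


# Made-up 8-bit grayscale values standing in for a real image (assumption).
cover = [(37 * i + 11) % 256 for i in range(64)]
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)

# Each pixel changes by at most one intensity level, which is why the
# change is invisible to the eye and needs statistical methods to detect.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Because each pixel moves by at most one intensity level, detection has to rely on subtle statistical traces rather than visible differences, which is where machine learning comes in.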

Guanshuo has many achievements to his name. His method for detecting and localizing real-world image tampering won second place in the first IEEE Image Forensics Challenge. In addition, a deep neural network architecture he designed outperformed conventional feature-based methods in image steganalysis for the first time. More recently, he won the ALASKA2 Image Steganalysis and RSNA STR Pulmonary Embolism Detection competitions and reached the #1 world ranking in the Kaggle competitions tier.

We also provide a link to a video interview with him about his achievements on Kaggle, published by CTDS.show.

(*Translation Note 3) CTDS.show is a YouTube channel that interviews AI engineers from around the world. It is run by Sanyam Bhutani, a machine learning engineer at H2O.ai's India office. “CTDS” stands for Chai Time Data Science. The channel also has an interview with Daniel Bourke, a machine learning engineer based in Australia who is frequently featured in AINOW's translated articles.


In this interview, we take a closer look at Guanshuo’s education, his passion for Kaggle, and his journey to becoming number one. Below are excerpts from our conversation with him.

You have a PhD in Electrical Engineering. Did that influence your decision to pursue machine learning as a career?

Guanshuo: Yes. In my doctoral research, I used machine learning techniques to solve problems such as image tampering detection and hidden data detection. For example, my last PhD research project applied deep neural networks to image steganalysis. My education and research are directly related to machine learning, so it was a natural career choice for me.


How did you first encounter Kaggle, and what kept you motivated on the way to Grandmaster?

Guanshuo’s Kaggle Profile

Guanshuo: From the moment I discovered Kaggle, I fell in love with it. What keeps me competing is a combined sense of satisfaction: winning competitions and prizes, learning new techniques, broadening and deepening my understanding of machine learning, building surprisingly effective models, and so on.


What does it feel like to be number one in the world at a competition? Do you feel extra pressure during competitions?

Current top 5 Kagglers in the competitions category | Image source: Kaggle website

Guanshuo: To be honest, I feel more pressure to keep the number one rank than I did to reach it. To keep my results consistent (a “smoother” performance), I sometimes have to take part in more competitions at the same time than I used to.


How do you approach a Kaggle problem?

Guanshuo's competition history | Image source: https://www.kaggle.com/wowfattie/competitions

Guanshuo: My approach depends on the type of problem and the goal of the competition. These days, I may spend days or even weeks just understanding the data and the problem and thinking about the right approach. For example, I often try to infer the distribution of the private test data, then come up with a plan that includes appropriate validation methods and detailed modeling steps. Once I have an idea of the overall approach, I start coding and modeling. That process in turn helps me understand the data and the problem better and, if necessary, modify and adjust my overall approach.
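
As a rough illustration of what an appropriate validation method can look like, the Python sketch below builds a grouped k-fold split, in which all samples from one group (for example, one patient) stay on the same side of every split. Guanshuo does not describe his validation code in this interview, so the choice of a grouped split, the function name `group_k_fold`, and the patient labels are all assumptions made for this example.

```python
from collections import defaultdict


def group_k_fold(groups, n_splits):
    """Yield (train_idx, valid_idx) pairs where no group spans both sides."""
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    if n_splits > len(by_group):
        raise ValueError("need at least one group per fold")

    # Assign the largest groups first, always to the currently smallest fold,
    # so that fold sizes stay roughly balanced.
    folds = [[] for _ in range(n_splits)]
    for group in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(folds, key=len)
        smallest.extend(by_group[group])

    for k in range(n_splits):
        valid = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, valid


# Hypothetical example: eight samples from four patients.
patients = ["a", "a", "b", "b", "b", "c", "d", "d"]
splits = list(group_k_fold(patients, n_splits=2))
```

If the private test set contains only unseen groups, a split like this gives validation scores that are closer to what the leaderboard will show than a purely random split would.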


Give us a peek into your toolkit, your favorite programming language, integrated development environment, algorithms, etc.

Guanshuo: As for my toolkit, I mainly use gedit, Python, and PyTorch for deep learning.

(*Translation Note 4) gedit is a text editor for the GNOME desktop environment. It includes features for programmers, such as syntax highlighting for many programming languages.


The field of data science is evolving rapidly. How do you keep up with the latest trends?

Guanshuo: I often find out about new things and technologies through Kaggle, my colleagues, or simply by Googling. When I explore new developments in machine learning, I keep real-world needs in mind: I filter out things that aren’t immediately useful and pay more attention to the potentially exciting ones. I also try to pick up information at the moment I need it.


What advice would you give to people who are just starting or wanting to start their data science journey?

Guanshuo: The right advice depends on the background and interests of the person asking. But finding the right platform for learning and upskilling generally makes things a lot easier, and participating in Kaggle competitions is one good choice of platform.


Achieving the number one spot in the world is no small feat, and Guanshuo’s uncompromising attitude and hard work are truly worthy of admiration. The variety of winning solutions he has produced on Kaggle shows a structured approach, which is an integral part of problem solving.


Read other interviews in this series

  • Rohan Rao: A data scientist’s journey from Sudoku to Kaggle
  • Shivam Bansal: Data Scientist Dominating Kaggle’s “Data Science for Good” Competition
  • Meet Yauhen, the first and only Kaggle Grandmaster from Belarus
  • Sudalai Rajkumar: A Passion for Numbers Turned a Mechanical Engineer into a Kaggle Grandmaster
  • Gabor Fodor: The Inspirational Journey of ‘Beluga’ in the World of Kaggle
  • I spoke with a data scientist who has been winning on Kaggle
  • Turkish grandmaster says learning from others is key to success on K

