XGBoost is All You Need

, Data Scientist, NVIDIA
Highly Rated
Gradient boosted trees (GBTs) are a class of machine learning algorithms that combine a decision tree "base learner" with the ensembling technique called boosting. Decision trees themselves are a very robust algorithm: even though they're nonlinear, they're easy to train, transparent, and relatively easy to interpret. These properties make decision trees ideally suited to many data science tasks and applications. Furthermore, GBTs have stood the test of time in terms of predictive power, especially in machine learning on tabular data.

XGBoost is one of the most successful implementations of GBTs. It's both an algorithm and a library, started as a research project by Tianqi Chen in 2014. For years, XGBoost has been a key component of almost all winning Kaggle solutions, as well as many important real-world applications. The library is now in version 2.0, and it's supported on almost every computing platform and environment. In particular, it boasts very robust GPU support. We'll showcase some of those capabilities, demoing a few lesser-known use cases for XGBoost: (1) multi-GPU and multi-machine training, (2) use of Shapley values for feature selection and feature engineering, and (3) use of XGBoost for unsupervised tasks.

We'll finish with a Q&A where we can discuss many other topics in data science, the future of work, open-source marketing, large language models, social media, etc.
Event: GTC 24
Date: March 2024
Industry: All Industries
NVIDIA technology: cuML
Topic: Data Analytics
Level: General
Language: English
Location: