Daniel Herman
About
I love building stuff using data and code. I am a data scientist with a background in theoretical biophysics and chemical physics. I am passionate about data science, machine learning, and software development.
Experience
Similarweb
Data Scientist (full time)
April 2024 - present
- The largest project is the automatic segmentation of visits using keyword searches of domains (e.g. Smartphones for Samsung). Working in a newly created team.
- Unsupervised ML / clustering / statistics / LLM / embeddings
- Python / pyspark / Databricks / AWS / polars
Nova TV
Data Scientist (bodyshop via Home Credit International)
August 2023 - March 2024
- The project focused on short-term prediction of shares in linear TV broadcasts. Worked in a newly created small team and designed the model pipeline from scratch. Explored target definitions, cleaned the data, and set up ETLs on Keboola. Designed features and trained models. Iterated with end customers.
- Supervised ML / regression / LightGBM
- Python / polars / Rust / Keboola / Go / Azure
Home Credit International
Data Scientist (full time)
October 2021 - March 2024
- Skilled in developing risk models for collections and underwriting using Logistic Regression, LightGBM, and statistical tests. Revived an internal package for scorecard development and improved it significantly. Experienced in speech-to-text processing with TensorFlow and LightGBM for voice recognition projects. Proficient in time series prediction using SARIMAX and Prophet for confidence interval estimation and alarm triggering. Created a comprehensive internal Spark cookbook. Also adept at light DevOps and Linux support, and at serving static React websites using nginx.
- Supervised ML / classification / LightGBM / Logistic Regression / TensorFlow / PyTorch / shap / pyspark
- Python / SQL (Oracle) / polars / Linux / Azure
IOCB
Student working in academia (part time)
October 2021 - March 2024
- Using protein fragments from the PDB, I developed a method for constructing small metalloproteins.
- Python / Unsupervised ML / Research in computational chemistry
Education
Master’s degree at Charles University in Prague
Theoretical Biophysics and Chemical Physics
September 2020 - February 2023
- Conducted in-depth research on Redfield equations, resulting in a universally applicable improvement technique for a given set of integrodifferential equations.
- Thesis: https://github.com/detrin/Master-Thesis/blob/main/thesis.pdf