👋 Hi, I'm Huang Xie (谒晃)

"Science is an error-correcting process." - Charles S. Peirce

🎓 About Me

I am a Machine Learning Researcher and PhD candidate at Tampere University, specializing in Signal Processing and Machine Learning.

My research focuses on multimodal learning, representation learning, and audio understanding. My work involves developing and optimizing deep learning models for audio classification, sound event detection, multimodal alignment/grounding, and cross-modal information retrieval.

🧠 Research Interests

  • 🎧 Machine Learning for Audio Understanding (classification, detection, retrieval, generation)
  • 🔍 Self-Supervised Representation Learning
  • 🔄 Multimodal Learning (audio + text + image + video)
  • 🧩 Low-Resource Learning (zero-shot, few-shot)

🛠️ Tech Stack

  • 💻 Programming: Python, Java, JavaScript, SQL, GDScript
  • ⚛️ Machine Learning: PyTorch, TensorFlow, scikit-learn, Ray Tune, MLflow
  • 🗣️ Audio & NLP: librosa, torchaudio, NLTK
  • 📊 Data Analysis: NumPy, SciPy, Pandas, Jupyter, Matplotlib
  • 🌐 Web & Backend: Java EE, Spring, Hibernate, Django, Flask, Gradio
  • 📱 GUI & Game Development: PySide6, Godot Engine
  • ⚙️ Databases & DevOps: MySQL, PostgreSQL, Linux, Docker, Git

🧪 Featured Projects

💬 Let's Connect

Happy to discuss multimodal ML, applied AI, or the challenges of building scalable AI systems. Whether you're hacking on a side project, exploring new ideas, or working in research, feel free to reach out. I'd love to exchange thoughts!

📫 Email: huang.xie@outlook.com
🔗 Google Scholar: scholar.google.com/citations?user=_wmP81AAAAAJ
🔗 LinkedIn: linkedin.com/in/huang-xie-28b7872bb

Popular repositories

  1. dcase2023-audio-retrieval (Public)

    Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge

    Python 10 3

  2. dcase2022-audio-retrieval (Public)

    Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2022 Challenge

    Python 8 1

  3. contrastive-negative-sampling (Public)

    Source code for negative sampling for contrastive audio-text retrieval (ICASSP 2023)

    Python 3

  4. language-based-audio-retrieval (Public)

    List of academic resources on Language-Based Audio Retrieval

    2

  5. audiocaps-dl (Public)

    Python program to download AudioCaps from YouTube.com

    Python 1

  6. audio-text-semantic-alignment (Public)

    Source code for audio-text semantic alignment (ICASSP 2022)

    Python