PyWaveClus: Python Spike Detection and Clustering

A Python implementation of Wave_clus with artifact removal.

The PyWaveClus pipeline consists of three main steps: spike detection, feature extraction, and clustering.

Overview

PyWaveClus is a Python package designed to analyze electrophysiological recordings. In neuroscience, spikes—brief electrical discharges from neurons—encode essential information about brain function. However, detecting and interpreting these spikes in large datasets can be complex due to noise and overlapping signals.

This package streamlines spike sorting by providing an automated pipeline that includes:

  • Spike Detection: Identifying neural spikes from noisy signals.
  • Feature Extraction: Transforming spike waveforms into numerical features for clustering.
  • Clustering: Grouping spikes likely originating from the same neuron.

1. Spike Detection

Neural spikes are often buried in noise, making robust detection essential. In PyWaveClus, spike detection relies on **wavelet-based thresholding**. The signal is first filtered, and then a **wavelet transform** is applied to isolate spikes based on transient changes in voltage.

Concept:
  • Filtering: Reduces noise by applying bandpass filters (e.g., 2 Hz to 4 kHz) to remove unwanted frequencies.
  • Wavelet Transform: Decomposes the signal into multiple frequency bands, enhancing the identification of sharp voltage changes characteristic of spikes.
  • Thresholding: Spikes are detected by comparing the wavelet coefficients to a predefined threshold.

Code Implementation:

import pywaveclus.spike_detection as sd

# Step 1: Load your electrophysiological recording (using SpikeInterface)
recording = ...            # Raw recording
recording_bp2 = ...        # Bandpass filtered at 2 Hz
recording_bp4 = ...        # Bandpass filtered at 4 kHz

# Step 2: Detect spikes
spike_detection_results = sd.detect_spikes(recording, recording_bp2, recording_bp4)

# Step 3: Extract waveforms of detected spikes
spikes_waveforms = sd.extract_waveforms(spike_detection_results, recording_bp2)

The output includes the timestamps and channels where spikes were detected. The extracted waveforms are segmented for further analysis.
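
To make the thresholding and waveform-cutting steps concrete, here is a minimal NumPy sketch on synthetic data. It is not the code behind `sd.detect_spikes`: it swaps in plain amplitude thresholding with the robust noise estimate median(|x|)/0.6745 described for the original Wave_clus method, and it skips filtering entirely. The function names, the multiplier `k`, the refractory period, and the 20/44-sample window are all assumptions chosen for illustration.

```python
import numpy as np

def robust_threshold(signal, k=5.0):
    # Noise SD estimated from the median absolute value, which is
    # much less inflated by the spikes themselves than np.std would be.
    sigma = np.median(np.abs(signal)) / 0.6745
    return k * sigma

def detect_crossings(signal, fs, k=5.0, refractory_ms=1.5):
    """Return sample indices of upward threshold crossings (hypothetical helper)."""
    thr = robust_threshold(signal, k)
    above = signal > thr
    # A crossing is a sample above threshold whose predecessor is not.
    onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    refractory = int(round(refractory_ms * 1e-3 * fs))
    kept = []
    for idx in onsets:
        # Ignore crossings that fall inside the dead time of the last kept one.
        if not kept or idx - kept[-1] > refractory:
            kept.append(int(idx))
    return np.array(kept, dtype=int), thr

def cut_waveforms(signal, spike_idx, pre=20, post=44):
    """Cut a fixed window around each detection, skipping those too close to the edges."""
    ok = [i for i in spike_idx if i - pre >= 0 and i + post <= len(signal)]
    if not ok:
        return np.empty((0, pre + post))
    return np.stack([signal[i - pre:i + post] for i in ok])

# Synthetic one-second trace: unit-variance noise plus five injected spikes.
rng = np.random.default_rng(0)
fs = 30_000
signal = rng.normal(0.0, 1.0, fs)
true_onsets = [3000, 9000, 15000, 21000, 27000]
template = 12.0 * np.hanning(15)
for p in true_onsets:
    signal[p:p + 15] += template

spike_idx, thr = detect_crossings(signal, fs)
waveforms = cut_waveforms(signal, spike_idx)
print(len(spike_idx), waveforms.shape, round(float(thr), 2))
```

Each row of `waveforms` is one candidate spike; in the real pipeline the equivalent array comes from `sd.extract_waveforms`.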

2. Feature Extraction

Once spikes are detected, we need to extract numerical features to represent their key characteristics. PyWaveClus supports two main methods:

  • Haar Wavelets: Efficiently captures local variations in the waveform shape.
  • Principal Component Analysis (PCA): Reduces the dimensionality of the waveform data while preserving the most important features.

How It Works:

The spike waveform is isolated and then transformed into features. These features are compact and informative, making clustering more efficient and accurate.

Code Implementation:

import pywaveclus.feature_extraction as fe

# Step 4: Extract features from spike waveforms
features = fe.feature_extraction(spikes_waveforms)

The extracted features are ready to be used for clustering. Researchers can also choose between different feature extraction methods based on their analysis needs.
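
The two feature types can be illustrated with a short NumPy sketch. It is a simplified stand-in, not the internals of `fe.feature_extraction`: the Haar transform is written by hand, coefficients are ranked by variance as an assumed placeholder for whatever selection criterion the package uses, and PCA is computed via SVD. All function names and parameter values here are hypothetical.

```python
import numpy as np

def haar_dwt(waveforms, levels=4):
    """Orthonormal multilevel Haar transform applied to each row."""
    approx = np.asarray(waveforms, dtype=float)
    if approx.shape[-1] % (2 ** levels) != 0:
        raise ValueError("waveform length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[..., 0::2], approx[..., 1::2]
        details.append((even - odd) / np.sqrt(2.0))  # local differences
        approx = (even + odd) / np.sqrt(2.0)         # local averages
    # Coarsest approximation first, then details from coarse to fine.
    return np.concatenate([approx] + details[::-1], axis=-1)

def top_variance_columns(coeffs, n_features=10):
    # Assumed selection rule: keep the coefficients that vary most across spikes.
    order = np.argsort(coeffs.var(axis=0))[::-1]
    return coeffs[:, order[:n_features]]

def pca_features(waveforms, n_components=3):
    # Project mean-centered waveforms onto the leading right singular vectors.
    centered = waveforms - waveforms.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Two synthetic "units" with different shapes, 40 noisy spikes each.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 64)
unit_a = np.exp(-((t - 0.30) / 0.05) ** 2)
unit_b = 0.6 * np.exp(-((t - 0.35) / 0.12) ** 2)
waveforms = np.vstack([
    unit_a + 0.05 * rng.normal(size=(40, 64)),
    unit_b + 0.05 * rng.normal(size=(40, 64)),
])

coeffs = haar_dwt(waveforms, levels=4)
haar_feats = top_variance_columns(coeffs, n_features=10)
pca_feats = pca_features(waveforms, n_components=3)
print(coeffs.shape, haar_feats.shape, pca_feats.shape)
```

Either feature matrix has one row per spike, which is the shape a clustering step expects.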


3. Clustering

Neural recordings often capture spikes from multiple neurons on a single electrode. Clustering helps distinguish spikes generated by different neurons, enabling researchers to study individual neuron activity.

PyWaveClus uses the **Superparamagnetic Clustering (SPC)** algorithm, which is particularly effective for high-dimensional data. It automatically identifies clusters without requiring strict assumptions about cluster shape.

How It Works:
  • Feature Similarity: The algorithm groups spikes based on the similarity of their features.
  • Temperature Parameter: Clusters are formed by adjusting a temperature parameter, which influences how tightly points are grouped.
  • Cluster Assignment: Each spike is labeled with a cluster ID representing the neuron it likely originated from.

Code Implementation:

import pywaveclus.clustering as clu

# Step 5: Perform clustering on the extracted features
labels, metadata = clu.SPC_clustering(features)

The output includes cluster labels and metadata (e.g., cluster sizes and temperature maps). This allows researchers to analyze each neuron's activity separately.
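
As an illustration of per-neuron analysis, the sketch below groups spike times by cluster label and reports a spike count and mean inter-spike interval for each cluster. It assumes only that `labels` is a one-dimensional array of integer cluster IDs aligned with an array of spike times in seconds; the helper name, the example values, and the returned dictionary layout are hypothetical, and the structure of `metadata` is not used.

```python
import numpy as np

def summarize_clusters(labels, spike_times_s):
    """Per-cluster spike count and mean inter-spike interval (hypothetical helper)."""
    labels = np.asarray(labels)
    spike_times_s = np.asarray(spike_times_s, dtype=float)
    summary = {}
    for cluster_id in np.unique(labels):
        # Spike times belonging to this cluster, in chronological order.
        times = np.sort(spike_times_s[labels == cluster_id])
        isis = np.diff(times)
        summary[int(cluster_id)] = {
            "n_spikes": int(times.size),
            "mean_isi_s": float(isis.mean()) if isis.size else float("nan"),
        }
    return summary

# Made-up example: eight spikes assigned to three clusters.
example_labels = [0, 1, 1, 2, 1, 2, 0, 1]
example_times = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85, 1.00, 1.15]
summary = summarize_clusters(example_labels, example_times)
for cluster_id, stats in summary.items():
    print(cluster_id, stats)
```

The same boolean-mask pattern (`labels == cluster_id`) can be used to pull out the waveforms or features of a single cluster.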


Running the Full Pipeline

To simplify the workflow, PyWaveClus offers the spike_sorting_pipeline function, which automates spike detection, feature extraction, and clustering.

Code Implementation:

import os

from pywaveclus.waveclus import spike_sorting_pipeline

OUTPUT_FOLDER = '/'
PROJECT_NAME = 'test'

def main():
    bundle_dict = ...
    recording = ...
    recording_bp2 = ...
    recording_bp4 = ...
    # Run the full pipeline
    spike_sorting_pipeline(
        recording,
        recording_bp2,
        recording_bp4,
        bundle_dict,
        artifact_removal=True,
        # os.path.join avoids a doubled separator when OUTPUT_FOLDER is '/';
        # the empty last component keeps the trailing separator.
        save_dir=os.path.join(OUTPUT_FOLDER, PROJECT_NAME, '')
    )

if __name__ == '__main__':
    main()

Summary

PyWaveClus is a modular spike-sorting package for neuroscience research. It combines spike detection, feature extraction, and clustering in a single pipeline to streamline the analysis of large-scale neural data. Because it integrates with SpikeInterface, the package supports various data formats and makes analyses easier to reproduce.