An HDC-based compression method makes distributed classifiers lighter to share while preserving predictive power.

When Compression Does Not Mean Loss: Distributed Intelligence
Distributed artificial intelligence promises a great deal, yet it faces a concrete and often underestimated limitation: communication between nodes. When learning is not carried out in a single centralized location but by multiple agents that process data locally and collaborate with one another, the quality of the final model depends on more than the classification algorithm. It also depends on communication costs, available bandwidth, and how far information exchange can be reduced without compromising predictive performance. The goal of this work is not simply to improve a distributed classifier, but to make the relationship between performance and communication overhead more controllable. To this end, the work explores a distributed learning model based on randomized neural networks and Hyperdimensional Computing (HDC), introduces a compression strategy that is more flexible than previous approaches, and benchmarks it against conventional methods such as lossless compression, dimensionality reduction, and quantization.
The Real Challenge: Learning Together Without Sharing Everything
The scenario considered involves multiple agents, each holding a portion of the data and training a local model without sharing the original samples. Each node builds its own classifier and communicates it to the others, which then integrate it into a more powerful aggregated representation. In such a setting, the bottleneck is not only computation but, above all, communication. Transmitting full classifiers across nodes can quickly become costly, especially when devices are distributed, resource-constrained, or operating in edge environments. The research addresses this challenge not by eliminating collaboration, but by redefining what is shared. The focus shifts to the local classifier and how it can be compressed before transmission, so that other agents can later reconstruct it in a form that preserves sufficient information for effective distributed learning.
From Randomized Networks to Hyperdimensional Computing
The underlying architecture is based on Random Vector Functional Link networks, a class of randomized neural networks where the nonlinear component is fixed randomly, while learning primarily involves the final classifier. This setup is well-suited to distributed scenarios, as it reduces training complexity and allows communication to focus on a specific part of the model. On top of this foundation, the Hyperdimensional Computing paradigm is introduced, offering a different way to represent and manipulate information. Instead of treating the classifier as a numerical matrix to be transmitted in full, it is transformed into a compressed high-dimensional representation through binding and superposition operations. The result is a controlled lossy compression, which does not aim at perfect reconstruction but at preserving the information relevant for distributed classification.
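The idea of a fixed random nonlinear layer with a trained readout can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `tanh` activation, the layer size, and the toy data are all assumptions, and the direct input-to-output links of a full Random Vector Functional Link network are omitted for brevity. The readout here is centroid-based, one of the two classifier types mentioned in the experiments.

```python
import math
import random

def make_hidden_layer(n_inputs, n_hidden, seed=0):
    """Draw the hidden weights and biases once at random; they are never trained."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return weights, biases

def hidden_features(x, layer):
    """Map an input vector through the fixed random nonlinear layer."""
    weights, biases = layer
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def train_centroids(samples, labels, layer):
    """Learning only touches the readout: one mean hidden vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        h = hidden_features(x, layer)
        acc = sums.setdefault(y, [0.0] * len(h))
        for j, value in enumerate(h):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(x, layer, centroids):
    """Assign the class whose centroid is closest in the hidden feature space."""
    h = hidden_features(x, layer)
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(h, centroids[y])))

# Toy data (assumed for illustration): two well-separated 2-D clusters.
samples = [(-2.0, -2.1), (-1.8, -2.3), (2.1, 1.9), (2.2, 2.4)]
labels = [0, 0, 1, 1]
layer = make_hidden_layer(n_inputs=2, n_hidden=16, seed=42)
centroids = train_centroids(samples, labels, layer)
print(predict((-2.0, -2.0), layer, centroids))
```

In this sketch the `centroids` table is the only trained part of the model, so it is the natural object to compress and send to other agents; the random layer can be regenerated from a shared seed.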
A More Flexible Compression Strategy
A key contribution lies in overcoming the rigidity of earlier approaches. Previously, compression was tightly constrained by the structure of the problem; here, a more general mechanism is introduced, allowing the compression ratio to be freely selected. The local classifier is reorganised, associated with a set of random keys, and compressed into a single hypervector whose dimensionality directly reflects the desired compression level. This shift is significant because it transforms compression from a fixed design choice into an operational parameter. The system is no longer limited to a single reduction strategy but can adapt to different communication bottlenecks. In this sense, compression is not treated as a secondary technical detail, but as a design lever to regulate the trade-off between accuracy and communication cost.
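A minimal sketch of this kind of key-based compression follows. It uses one common Hyperdimensional Computing scheme (random bipolar keys, binding by multiplication, superposition by summation) and is an illustration under assumptions, not the exact procedure of the paper: the helper names, the flattening of the classifier into a list of values, and the definition of the ratio as values divided by hypervector dimensions are all hypothetical.

```python
import math
import random

def hypervector_dim(n_values, ratio):
    """Hypervector size for a chosen compression ratio (values per dimension)."""
    return max(1, math.ceil(n_values / ratio))

def make_keys(n_values, dim, seed):
    """One random bipolar (+1/-1) key per classifier value; the seed is shared."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(n_values)]

def compress(weights, keys):
    """Bind each value to its key (value * key) and superpose everything by summing."""
    dim = len(keys[0])
    hv = [0.0] * dim
    for w, key in zip(weights, keys):
        for j in range(dim):
            hv[j] += w * key[j]
    return hv

def decompress(hv, keys):
    """Unbind with each key; the other values remain as crosstalk noise."""
    dim = len(hv)
    return [sum(k * h for k, h in zip(key, hv)) / dim for key in keys]

# Assumed example: a flattened local classifier with 12 values, compressed 2x.
weights = [0.8, -0.3, 0.5, 0.1, -0.9, 0.4, 0.0, 0.7, -0.2, 0.6, -0.5, 0.3]
dim = hypervector_dim(len(weights), ratio=2)
keys = make_keys(len(weights), dim, seed=7)
hypervector = compress(weights, keys)      # this is what would be transmitted
approx = decompress(hypervector, keys)     # lossy reconstruction at the receiver
print(len(weights), "values ->", len(hypervector), "dimensions")
```

Because the receiver can regenerate the keys from the shared seed, only the hypervector needs to be sent. Changing `ratio` changes the hypervector length directly, which is the sense in which compression becomes an operational parameter; a higher ratio means fewer dimensions and more crosstalk in the reconstruction.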
Benchmarking Against Conventional Methods
To assess the effectiveness of the approach, the method is compared with three families of alternatives. The first is conventional lossless compression, used as a reference since it preserves the classifier entirely but does not allow flexible compression ratios and remains computationally demanding. The second is lossy compression based on truncated singular value decomposition, which reduces dimensionality by retaining only the principal components. The third is quantisation, analysed to evaluate how much the classifier can tolerate reduced numerical precision. The comparison highlights a key distinction. Conventional techniques can be effective as general-purpose compression tools, but they are not necessarily optimised for distributed and resource-constrained environments. In contrast, the Hyperdimensional Computing-based solution is inherently designed for such settings and provides a favourable balance between operational simplicity, reduced communication load, and maintained performance.
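Of the three baselines, quantisation is the simplest to illustrate. The sketch below shows plain uniform quantisation of classifier values to a chosen number of bits; the function names and the min/max scaling are assumptions, and the study may have used a different quantisation scheme.

```python
def quantize(weights, n_bits):
    """Map each value to an integer code in [0, 2**n_bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate values; the error is at most half a step."""
    return [lo + c * scale for c in codes]

# Assumed example values. If the originals were stored as 64-bit floats,
# 4-bit codes would shrink the payload roughly 16x (plus `lo` and `scale`).
values = [-1.0, -0.25, 0.1, 0.7, 1.0]
codes, lo, scale = quantize(values, n_bits=4)
restored = dequantize(codes, lo, scale)
print(codes)
```

Unlike the key-based hypervector approach, the achievable compression here is tied to the bit width, so the ratio can only move in coarse steps.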
Experimental Validation on Real-World Data
The empirical evaluation is conducted on 18 real-world datasets from the UCI Machine Learning Repository, considering distributed scenarios with 10 and 100 agents. The experiments analyse how accuracy evolves as the compression ratio increases and compare results obtained using classifiers based on regularised least squares and centroid-based methods. The results confirm that increasing compression leads to a gradual decrease in performance, as expected. However, the proposed method remains competitive and, in several cases, outperforms SVD-based compression. A particularly interesting effect emerges as the number of agents grows: the noise introduced by decompression tends to average out during aggregation, making the method more robust precisely in highly collaborative settings. The study also explores an alternative strategy: using smaller models instead of compressing larger ones. The findings do not point to a single definitive answer, but rather open an important design question. In some cases, reduced models approach the performance of compressed ones; in others, compressing the full classifier preserves more useful information. This suggests that model size and compression should be treated as complementary design tools rather than mutually exclusive choices.
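The averaging effect can be reproduced in a toy setting. The sketch below is built on strong simplifying assumptions that are not taken from the study: every agent holds the same classifier values, each agent uses its own independent random bipolar keys, and aggregation is a plain average of the decompressed values. All names are hypothetical.

```python
import math
import random

def roundtrip(weights, dim, seed):
    """Compress with agent-specific bipolar keys, then decompress (lossy)."""
    rng = random.Random(seed)
    keys = [[rng.choice((-1, 1)) for _ in range(dim)] for _ in weights]
    hv = [sum(w * key[j] for w, key in zip(weights, keys)) for j in range(dim)]
    return [sum(k * h for k, h in zip(key, hv)) / dim for key in keys]

def averaged_roundtrip(weights, dim, n_agents):
    """Average the lossy reconstructions received from n_agents agents."""
    total = [0.0] * len(weights)
    for agent in range(n_agents):
        for i, value in enumerate(roundtrip(weights, dim, seed=agent)):
            total[i] += value
    return [t / n_agents for t in total]

def rms_error(estimate, truth):
    """Root-mean-square difference between reconstruction and original."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

# Assumed setup: 20 classifier values compressed into 10 dimensions (ratio 2).
rng = random.Random(123)
weights = [rng.uniform(-1, 1) for _ in range(20)]
err_one = rms_error(averaged_roundtrip(weights, dim=10, n_agents=1), weights)
err_many = rms_error(averaged_roundtrip(weights, dim=10, n_agents=100), weights)
print(err_one, err_many)
```

With independent keys the crosstalk terms of different agents are uncorrelated, so their average shrinks as more agents contribute, while the true values add up coherently. This is one plausible reading of why the method becomes more robust with 100 agents than with 10.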
Authors
A. Rosato, M. Panella, E. Osipov, D. Kleyko
August 21, 2021









