Prev: 2022.04.08 Next: 2022.04.10

Summary for 2022-04-09, created on 2022-04-19

Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification arxiv:2204.04567 📈 7

Jiangtao Xie, Fei Long, Jiaming Lv, Qilong Wang, Peihua Li

**Abstract:** Few-shot classification is a challenging problem as only very few training examples are given for each new task. One of the effective research lines to address this challenge focuses on learning deep representations driven by a similarity measure between a query image and few support images of some class. Statistically, this amounts to measuring the dependency of image features, viewed as random vectors in a high-dimensional embedding space. Previous methods either only use marginal distributions without considering joint distributions, suffering from limited representation capability, or are computationally expensive despite harnessing joint distributions. In this paper, we propose a deep Brownian Distance Covariance (DeepBDC) method for few-shot classification. The central idea of DeepBDC is to learn image representations by measuring the discrepancy between the joint characteristic function of embedded features and the product of the marginals. As the BDC metric is decoupled, we formulate it as a highly modular and efficient layer. Furthermore, we instantiate DeepBDC in two different few-shot classification frameworks. We conduct experiments on six standard few-shot image benchmarks, covering general object recognition, fine-grained categorization and cross-domain classification. Extensive evaluations show our DeepBDC significantly outperforms the counterparts, while establishing new state-of-the-art results. The source code is available at http://www.peihuali.org/DeepBDC
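
The underlying statistic, Brownian distance covariance, has a short closed form built from double-centered pairwise Euclidean distance matrices. Below is a minimal NumPy sketch of a BDC-style similarity between two feature maps, treating spatial positions as observations of a random vector; it illustrates the statistic itself, not the paper's exact layer or pooling.

```python
import numpy as np

def bdc_matrix(feats):
    """BDC matrix of a feature map.

    feats: (n, d) array -- n spatial positions treated as n observations
    of a d-dimensional random vector. Returns the double-centered
    pairwise Euclidean distance matrix."""
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    a = np.sqrt(np.maximum(d2, 0.0))
    # double centering: subtract row/column means, add back the grand mean
    return a - a.mean(0, keepdims=True) - a.mean(1, keepdims=True) + a.mean()

def bdc_similarity(query, support):
    """Similarity of two images as the inner product of their BDC matrices
    (a sketch of the idea; the paper's layer adds learned embeddings)."""
    return float(np.sum(bdc_matrix(query) * bdc_matrix(support)))

rng = np.random.default_rng(0)
q, s = rng.normal(size=(49, 64)), rng.normal(size=(49, 64))
print(bdc_similarity(q, s))
```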

Super-Resolved Microbubble Localization in Single-Channel Ultrasound RF Signals Using Deep Learning arxiv:2204.04537 📈 6

Nathan Blanken, Jelmer M. Wolterink, Hervé Delingette, Christoph Brune, Michel Versluis, Guillaume Lajoinie

**Abstract:** Recently, super-resolution ultrasound imaging with ultrasound localization microscopy (ULM) has received much attention. However, ULM relies on low concentrations of microbubbles in the blood vessels, ultimately resulting in long acquisition times. Here, we present an alternative super-resolution approach, based on direct deconvolution of single-channel ultrasound radio-frequency (RF) signals with a one-dimensional dilated convolutional neural network (CNN). This work focuses on low-frequency ultrasound (1.7 MHz) for deep imaging (10 cm) of a dense cloud of monodisperse microbubbles (up to 1000 microbubbles in the measurement volume, corresponding to an average echo overlap of 94%). Data are generated with a simulator that uses a large range of acoustic pressures (5-250 kPa) and captures the full, nonlinear response of resonant, lipid-coated microbubbles. The network is trained with a novel dual-loss function, which features elements of both a classification loss and a regression loss and improves the detection-localization characteristics of the output. Whereas imposing a localization tolerance of 0 yields poor detection metrics, imposing a localization tolerance corresponding to 4% of the wavelength yields a precision and recall of 0.90 each. Furthermore, the detection improves with increasing acoustic pressure and deteriorates with increasing microbubble density. The potential of the presented approach to super-resolution ultrasound imaging is demonstrated with a delay-and-sum reconstruction with deconvolved element data. The resulting image shows an order-of-magnitude gain in axial resolution compared to a delay-and-sum reconstruction with unprocessed element data.
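
A 1D dilated CNN is a natural fit here because the receptive field grows exponentially with depth, covering the long echoes of a 10 cm imaging depth. The PyTorch sketch below is illustrative only: channel counts, depth, and the two-term loss are assumed shapes of the idea, not the paper's architecture or exact dual-loss formula.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedRFNet(nn.Module):
    """Sketch of a 1D dilated CNN for RF deconvolution: dilation doubles
    at each layer, so the receptive field grows exponentially with depth."""
    def __init__(self, channels=32, depth=8):
        super().__init__()
        layers = [nn.Conv1d(1, channels, 3, padding=1), nn.ReLU()]
        for i in range(depth):
            d = 2 ** i
            layers += [nn.Conv1d(channels, channels, 3, padding=d, dilation=d),
                       nn.ReLU()]
        layers.append(nn.Conv1d(channels, 2, 1))   # (detection logit, amplitude)
        self.net = nn.Sequential(*layers)

    def forward(self, rf):                         # rf: (batch, 1, n_samples)
        out = self.net(rf)
        return out[:, 0], out[:, 1]                # per-sample logits, amplitudes

def dual_loss(logits, amps, target, alpha=0.5):
    """Assumed form of a dual loss: classification of bubble presence plus
    regression of amplitudes at true bubble locations."""
    mask = (target != 0).float()
    cls = F.binary_cross_entropy_with_logits(logits, mask)
    reg = F.mse_loss(amps * mask, target)
    return alpha * cls + (1 - alpha) * reg
```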

IDPG: An Instance-Dependent Prompt Generation Method arxiv:2204.04497 📈 6

Zhuofeng Wu, Sinong Wang, Jiatao Gu, Rui Hou, Yuxiao Dong, V. G. Vinod Vydiswaran, Hao Ma

**Abstract:** Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage. It freezes the pre-trained language model and only optimizes a few task-specific prompts. In this paper, we propose a conditional prompt generation method to generate prompts for each input instance, referred to as the Instance-Dependent Prompt Generation (IDPG). Unlike traditional prompt tuning methods that use a fixed prompt, IDPG introduces a lightweight and trainable component to generate prompts based on each input sentence. Extensive experiments on ten natural language understanding (NLU) tasks show that the proposed strategy consistently outperforms various prompt tuning baselines and is on par with other efficient transfer learning methods such as Compacter while tuning far fewer model parameters.
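
The core idea is easy to sketch: a small trainable bottleneck network maps a representation of each input sentence to a handful of soft-prompt vectors, which are prepended to the frozen model's input embeddings. Names and sizes below are illustrative; the paper's generator may differ in detail.

```python
import torch
import torch.nn as nn

class InstanceDependentPromptGenerator(nn.Module):
    """Minimal sketch of the IDPG idea: sentence representation -> m soft
    prompt vectors via a lightweight bottleneck (the only trained part)."""
    def __init__(self, hidden=768, bottleneck=64, prompt_len=5):
        super().__init__()
        self.prompt_len, self.hidden = prompt_len, hidden
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, prompt_len * hidden)

    def forward(self, sent_repr):                  # (batch, hidden)
        p = self.up(torch.tanh(self.down(sent_repr)))
        return p.view(-1, self.prompt_len, self.hidden)

# usage: prepend instance-dependent prompts to the frozen token embeddings
gen = InstanceDependentPromptGenerator()
sent = torch.randn(4, 768)          # pooled representation of each input
tokens = torch.randn(4, 20, 768)    # frozen input embeddings
inputs = torch.cat([gen(sent), tokens], dim=1)    # (4, 25, 768)
```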

Robust Cross-Modal Representation Learning with Progressive Self-Distillation arxiv:2204.04588 📈 5

Alex Andonian, Shixing Chen, Raffay Hamid

**Abstract:** The learning objective of the vision-language approach CLIP does not effectively account for the noisy many-to-many correspondences found in web-harvested image captioning datasets, which contributes to its compute and data inefficiency. To address this challenge, we introduce a novel training framework based on cross-modal contrastive learning that uses progressive self-distillation and soft image-text alignments to more efficiently learn robust representations from noisy data. Our model distills its own knowledge to dynamically generate soft-alignment targets for a subset of images and captions in every minibatch, which are then used to update its parameters. Extensive evaluation across 14 benchmark datasets shows that our method consistently outperforms its CLIP counterpart in multiple settings, including: (a) zero-shot classification, (b) linear probe transfer, and (c) image-text retrieval, without incurring added computational cost. Analysis using an ImageNet-based robustness test-bed reveals that our method offers better effective robustness to natural distribution shifts compared to both ImageNet-trained models and CLIP itself. Lastly, pretraining with datasets spanning two orders of magnitude in size shows that our improvements over CLIP tend to scale with the number of training examples.
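
One illustrative reading of "distilling its own knowledge into soft-alignment targets": for a fraction of the batch, replace the one-hot contrastive targets with the model's own (detached) similarity distribution. The sketch below captures that mechanism; the paper's actual teacher construction and target schedule may differ.

```python
import torch
import torch.nn.functional as F

def _direction_loss(logits, k):
    """Cross-entropy against mixed targets: soft (self-distilled) for the
    first k rows of the batch, one-hot for the rest."""
    n = logits.size(0)
    hard = torch.eye(n, device=logits.device)
    soft = F.softmax(logits.detach(), dim=-1)   # model's own predictions as teacher
    targets = torch.cat([soft[:k], hard[k:]], dim=0)
    return F.cross_entropy(logits, targets)

def soft_alignment_loss(img_emb, txt_emb, temp=0.07, distill_frac=0.5):
    """Sketch of progressive self-distillation for image-text contrastive
    learning; distill_frac would grow over training in the progressive case."""
    logits = F.normalize(img_emb, dim=-1) @ F.normalize(txt_emb, dim=-1).t() / temp
    k = int(distill_frac * logits.size(0))
    return 0.5 * (_direction_loss(logits, k) + _direction_loss(logits.t(), k))
```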

Extending the Scope of Out-of-Domain: Examining QA models in multiple subdomains arxiv:2204.04534 📈 5

Chenyang Lyu, Jennifer Foster, Yvette Graham

**Abstract:** Past works that investigate out-of-domain performance of QA systems have mainly focused on general domains (e.g. news domain, wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets. In this paper, we extend the scope of "out-of-domain" by splitting QA examples into different subdomains according to several internal characteristics, including question type, text length, and answer position. We then examine the performance of QA systems trained on data from different subdomains. Experimental results show that the performance of QA systems can be significantly reduced when the training data and test data come from different subdomains. These results question the generalizability of current QA systems in multiple subdomains, suggesting the need to combat the bias introduced by the internal characteristics of QA datasets.
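
The subdomain splits are straightforward to implement. Here is a minimal sketch that buckets a QA example by the three characteristics the abstract names; the field names and bucket boundaries are illustrative, not the paper's.

```python
def subdomain(example):
    """Assign a QA example to a (question type, length, position) bucket."""
    q, ctx, ans_start = example["question"], example["context"], example["answer_start"]
    qtype = q.strip().split()[0].lower()                     # what / who / when ...
    length = "short" if len(ctx.split()) < 150 else "long"   # illustrative cutoff
    position = "early" if ans_start < len(ctx) / 2 else "late"
    return qtype, length, position

ex = {"question": "Who wrote it?",
      "context": "Jane wrote the book. " * 10,
      "answer_start": 0}
print(subdomain(ex))   # ('who', 'short', 'early')
```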

Why did I fail? A Causal-based Method to Find Explanations for Robot Failures arxiv:2204.04483 📈 5

Maximilian Diehl, Karinne Ramirez-Amaro

**Abstract:** Robot failures in human-centered environments are inevitable. Therefore, the ability of robots to explain such failures is paramount for interacting with humans to increase trust and transparency. To achieve this skill, the main challenges addressed in this paper are I) acquiring enough data to learn a cause-effect model of the environment and II) generating causal explanations based on that model. We address I) by learning a causal Bayesian network from simulation data. Concerning II), we propose a novel method that enables robots to generate contrastive explanations upon task failures. The explanation is based on setting the failure state in contrast with the closest state that would have allowed for successful execution, which is found through breadth-first search and is based on success predictions from the learned causal model. We assess the sim2real transferability of the causal model on a cube stacking scenario. Based on real-world experiments with two differently embodied robots, we achieve a sim2real accuracy of 70% without any adaptation or retraining. Our method thus allowed real robots to give failure explanations like, 'the upper cube was dropped too high and too far to the right of the lower cube.'
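
The contrastive-explanation step reduces to a breadth-first search from the failure state over discretized state variables, stopping at the first state the learned causal model predicts as successful. Below is a minimal Python sketch; `predict_success` and `neighbors` are assumed interfaces to the causal Bayesian network and the state discretization, and the toy usage is purely illustrative.

```python
from collections import deque

def closest_successful_state(failure_state, predict_success, neighbors):
    """BFS from the failure state to the nearest state predicted to
    succeed; contrasting the two yields the failure explanation."""
    seen = {failure_state}
    queue = deque([failure_state])
    while queue:
        state = queue.popleft()
        if predict_success(state):
            return state               # closest state allowing success
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

# toy usage: state = (drop_height_level, x_offset_level) on a 5x5 grid
neighbors = lambda s: [(s[0] + d0, s[1] + d1)
                       for d0, d1 in [(-1, 0), (1, 0), (0, -1), (0, 1)]
                       if 0 <= s[0] + d0 <= 4 and 0 <= s[1] + d1 <= 4]
success = lambda s: s[0] <= 1 and s[1] <= 1   # stand-in for the causal model
print(closest_successful_state((3, 4), success, neighbors))
```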

Adaptive Differential Filters for Fast and Communication-Efficient Federated Learning arxiv:2204.04424 📈 5

Daniel Becking, Heiner Kirchhoffer, Gerhard Tech, Paul Haase, Karsten Müller, Heiko Schwarz, Wojciech Samek

**Abstract:** Federated learning (FL) scenarios inherently generate a large communication overhead by frequently transmitting neural network updates between clients and server. To minimize the communication cost, introducing sparsity in conjunction with differential updates is a commonly used technique. However, sparse model updates can slow down convergence speed or unintentionally skip certain update aspects, e.g., learned features, if error accumulation is not properly addressed. In this work, we propose a new scaling method operating at the granularity of convolutional filters which 1) compensates for highly sparse updates in FL processes, 2) adapts the local models to new data domains by enhancing some features in the filter space while diminishing others and 3) motivates extra sparsity in updates and thus achieves higher compression ratios, i.e., savings in the overall data transfer. Compared to unscaled updates and previous work, experimental results on different computer vision tasks (Pascal VOC, CIFAR10, Chest X-Ray) and neural networks (ResNets, MobileNets, VGGs) in uni-, bidirectional and partial update FL settings show that the proposed method improves the performance of the central server model while converging faster and reducing the total amount of transmitted data by up to 377 times.
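
To make the filter-granular idea concrete, here is one plausible compensation rule: after top-k sparsification of a convolutional weight update, rescale each filter's surviving entries so the filter's update norm is preserved. This is an assumed rule in the spirit of the abstract, not the paper's actual scaling formula.

```python
import torch

def sparsify_and_scale(delta, keep_frac=0.1, eps=1e-12):
    """Top-k sparsify a conv-weight delta, then rescale per filter so each
    filter's update norm matches its pre-sparsification norm.

    delta: (out_channels, in_channels, k, k) differential weight update."""
    flat = delta.abs().flatten()
    k = max(1, int(keep_frac * flat.numel()))
    thresh = flat.kthvalue(flat.numel() - k + 1).values   # k-th largest magnitude
    sparse = delta * (delta.abs() >= thresh)
    dims = tuple(range(1, delta.dim()))                   # per-filter reduction
    orig = delta.pow(2).sum(dim=dims, keepdim=True).sqrt()
    kept = sparse.pow(2).sum(dim=dims, keepdim=True).sqrt()
    return sparse * (orig / (kept + eps))

scaled = sparsify_and_scale(torch.randn(8, 4, 3, 3))
```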

Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering arxiv:2204.04581 📈 4

Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen

**Abstract:** Retrieval augmented language models have recently become the standard for knowledge intensive tasks. Rather than relying purely on latent semantics within the parameters of large neural models, these methods enlist a semi-parametric memory to encode an index of knowledge for the model to retrieve over. Most prior work has employed text passages as the unit of knowledge, which has high coverage at the cost of interpretability, controllability, and efficiency. The opposite properties arise in other methods which have instead relied on knowledge base (KB) facts. At the same time, more recent work has demonstrated the effectiveness of storing and retrieving from an index of Q-A pairs derived from text (Lewis et al., 2021). This approach yields a high coverage knowledge representation that maintains KB-like properties due to its representations being more atomic units of information. In this work we push this line of research further by proposing a question-answer augmented encoder-decoder model and accompanying pretraining strategy. This yields an end-to-end system that not only outperforms prior QA retrieval methods on single-hop QA tasks but also enables compositional reasoning, as demonstrated by strong performance on two multi-hop QA datasets. Together, these methods improve the ability to interpret and control the model while narrowing the performance gap with passage retrieval systems.

Self-Labeling Refinement for Robust Representation Learning with Bootstrap Your Own Latent arxiv:2204.04545 📈 4

Siddhant Garg, Dhruval Jain

**Abstract:** In this work, we pursue two major goals. Firstly, we investigate the importance of Batch Normalisation (BN) layers in a non-contrastive representation learning framework called Bootstrap Your Own Latent (BYOL). We conducted several experiments and conclude that BN layers are not necessary for representation learning in BYOL. Moreover, BYOL only learns from the positive pairs of images but ignores other semantically similar images in the same input batch. For the second goal, we introduce two new loss functions that identify semantically similar pairs within the input batch of images and reduce the distance between their representations: the Cross-Cosine Similarity Loss (CCSL) and the Cross-Sigmoid Similarity Loss (CSSL). Using the proposed loss functions, we surpass the performance of Vanilla BYOL (71.04%) by training the BYOL framework with the CCSL loss (76.87%) on the STL10 dataset. BYOL trained with the CSSL loss performs comparably to Vanilla BYOL.
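
The paper defines CCSL precisely; the sketch below only illustrates the mechanism it describes, with a hypothetical threshold and weighting: beyond BYOL's positive pair, an image is also pulled toward other batch members whose target-network representations are highly cosine-similar.

```python
import torch
import torch.nn.functional as F

def cross_cosine_similarity_loss(online, target, thresh=0.7):
    """Hedged sketch of the CCSL idea.

    online: (batch, dim) predictions from the online network
    target: (batch, dim) detached outputs of the target network"""
    online = F.normalize(online, dim=-1)
    target = F.normalize(target, dim=-1)
    sim = target @ target.t()                # who counts as "semantically similar"
    weights = (sim > thresh).float() * sim   # weight similar pairs by similarity
    weights.fill_diagonal_(1.0)              # the usual BYOL positive pair
    cross = online @ target.t()              # online-to-target cosine similarities
    return -(weights * cross).sum() / weights.sum()
```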

Efficient Extraction of Pathologies from C-Spine Radiology Reports using Multi-Task Learning arxiv:2204.04544 📈 4

Arijit Sehanobish, Nathaniel Brown, Ishita Daga, Jayashri Pawar, Danielle Torres, Anasuya Das, Murray Becker, Richard Herzog, Benjamin Odry, Ron Vianu

**Abstract:** Pretrained Transformer based models finetuned on domain specific corpora have changed the landscape of NLP. Generally, if one has multiple tasks on a given dataset, one may finetune different models or use task specific adapters. In this work, we show that a single multi-task model can match or beat the performance of multiple BERT-based models finetuned on various tasks, as well as various task specific adapter augmented BERT-based models. We validate our method on our internal dataset of radiologists' reports on the cervical spine. We hypothesize that the tasks are semantically close and related, and thus multitask learners are powerful classifiers. Our work opens up the possibility of applying our method to radiologists' reports on various body parts.
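
Structurally, the multi-task setup amounts to one shared encoder with a lightweight classification head per task, trained jointly instead of finetuning a separate BERT per task. A minimal PyTorch sketch, assuming a HuggingFace-style encoder interface; the task names and label counts are illustrative:

```python
import torch.nn as nn

class MultiTaskReportClassifier(nn.Module):
    """One shared transformer encoder, one linear head per pathology task."""
    def __init__(self, encoder, hidden=768,
                 tasks=(("stenosis", 2), ("disc_pathology", 3))):
        super().__init__()
        self.encoder = encoder
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, n_labels) for name, n_labels in tasks})

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids, attention_mask=attention_mask)
        pooled = out.pooler_output               # shared report representation
        return {name: head(pooled) for name, head in self.heads.items()}
```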

Applying machine learning to predict behavior of bus transport in Warsaw, Poland arxiv:2204.04515 📈 4

Łukasz Pałys, Maria Ganzha, Marcin Paprzycki

**Abstract:** Nowadays, it is possible to collect precise data describing the movements of public transport. Specifically, geoposition data can be regularly collected for each bus (or tram). This includes data for all buses in Warsaw, Poland. Moreover, this data can be downloaded and analyzed. In this context, one of the simplest questions is: can a model be built to represent the behavior of buses and predict their delays? This work provides initial results of our attempt to answer this question.

Trajectory Optimization Using Neural Network Gradients of Learned Dynamics arxiv:2204.04558 📈 3

Nathanael Köhler, Bhavya Sukhija, Miguel Zamora, Simon Zimmermann, Stelian Coros

**Abstract:** Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods heavily rely on accurate physics simulators, yet some aspects of the physical world, such as friction, can only be captured to a limited extent by most simulators. The goal of this paper is to leverage trajectory optimization for performing highly dynamic and complex tasks with robotic systems in the absence of an accurate physics simulator. This is achieved by applying machine learning techniques to learn a differentiable dynamics model of the system from data. Using the example of an RC car, we show that from data collected in only 15 minutes of human-operated interactions with the car, a neural network is able to model highly nonlinear behaviors such as loss of traction and drifting. Furthermore, we use the analytical gradients of the neural network to perform gradient-based trajectory optimization, both in an offline and online setting. We find that our learned model is able to represent complex physical behavior, such as drifting, and gives unprecedented performance in combination with trajectory optimization methods.
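
The mechanism is simple to sketch: roll the learned, differentiable dynamics model forward under a candidate action sequence and descend the analytic gradient of a task cost with respect to the actions. The quadratic goal cost and the two-dimensional action below are illustrative assumptions; `dynamics` stands in for the trained network.

```python
import torch

def optimize_trajectory(dynamics, x0, horizon, goal, iters=200, lr=0.05):
    """Gradient-based trajectory optimization through a learned model.

    dynamics: differentiable module mapping (state, action) -> next state."""
    actions = torch.zeros(horizon, 2, requires_grad=True)  # e.g. steer, throttle
    opt = torch.optim.Adam([actions], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        x, cost = x0, 0.0
        for u in actions:
            x = dynamics(x, u)                    # differentiable rollout
            cost = cost + ((x - goal) ** 2).sum() # illustrative tracking cost
        cost.backward()                           # gradients flow through the model
        opt.step()
    return actions.detach()
```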

Efficient Representation Learning of Subgraphs by Subgraph-To-Node Translation arxiv:2204.04510 📈 3

Dongkwan Kim, Alice Oh

**Abstract:** A subgraph is a data structure that can represent various real-world problems. We propose Subgraph-To-Node (S2N) translation, which is a novel formulation to efficiently learn representations of subgraphs. Specifically, given a set of subgraphs in the global graph, we construct a new graph by coarsely transforming subgraphs into nodes. We perform subgraph-level tasks as node-level tasks through this translation. By doing so, we can significantly reduce the memory and computational costs in both training and inference. We conduct experiments on four real-world datasets to evaluate performance and efficiency. Our experiments demonstrate that models with S2N translation are more efficient than state-of-the-art models without substantial performance decrease.
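
A minimal sketch of the S2N construction: each subgraph becomes one node of a coarse graph, and two such nodes are connected when the original subgraphs touch. The connection rule below (shared vertex or a crossing edge) is an assumed modeling choice, not necessarily the paper's.

```python
import networkx as nx
from itertools import combinations

def subgraph_to_node(global_graph, subgraphs):
    """Coarsen a set of subgraphs (node lists) into a new graph with one
    node per subgraph, enabling subgraph-level tasks as node-level tasks."""
    coarse = nx.Graph()
    coarse.add_nodes_from(range(len(subgraphs)))
    for i, j in combinations(range(len(subgraphs)), 2):
        a, b = set(subgraphs[i]), set(subgraphs[j])
        touches = a & b or any(global_graph.has_edge(u, v) for u in a for v in b)
        if touches:
            coarse.add_edge(i, j)
    return coarse

G = nx.karate_club_graph()
subs = [[0, 1, 2], [2, 3, 7], [30, 32, 33]]
print(subgraph_to_node(G, subs).edges())
```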

Noise-based Enhancement for Foveated Rendering arxiv:2204.04455 📈 3

Taimoor Tariq, Cara Tursun, Piotr Didyk

**Abstract:** Human visual sensitivity to spatial details declines towards the periphery. Novel image synthesis techniques, so-called foveated rendering, exploit this observation and reduce the spatial resolution of synthesized images for the periphery, avoiding the synthesis of high-spatial-frequency details that are costly to generate but not perceived by a viewer. However, contemporary techniques do not make a clear distinction between the range of spatial frequencies that must be reproduced and those that can be omitted. For a given eccentricity, there is a range of frequencies that are detectable but not resolvable. While the accurate reproduction of these frequencies is not required, an observer can detect their absence if completely omitted. We use this observation to improve the performance of existing foveated rendering techniques. We demonstrate that this specific range of frequencies can be efficiently replaced with procedural noise whose parameters are carefully tuned to image content and human perception. Consequently, these frequencies do not have to be synthesized during rendering, allowing more aggressive foveation, and they can be replaced by noise generated in a less expensive post-processing step, leading to improved performance of the rendering system. Our main contribution is a perceptually-inspired technique for deriving the parameters of the noise required for the enhancement and its calibration. The method operates on rendering output and runs at rates exceeding 200 FPS at 4K resolution, making it suitable for integration with real-time foveated rendering systems for VR and AR devices. We validate our results and compare them to the existing contrast enhancement technique in user experiments.

Unbiased Directed Object Attention Graph for Object Navigation arxiv:2204.04421 📈 3

Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen

**Abstract:** Object navigation tasks require agents to locate specific objects in unknown environments based on visual information. Previously, graph convolutions were used to implicitly explore the relationships between objects. However, due to differences in visibility among objects, it is easy to generate biases in object attention. Thus, in this paper, we propose a directed object attention (DOA) graph to guide the agent in explicitly learning the attention relationships between objects, thereby reducing the object attention bias. In particular, we use the DOA graph to perform unbiased adaptive object attention (UAOA) on the object features and unbiased adaptive image attention (UAIA) on the raw images, respectively. To distinguish features in different branches, a concise adaptive branch energy distribution (ABED) method is proposed. We assess our methods on the AI2-Thor dataset. Compared with the state-of-the-art (SOTA) method, our method reports 7.4%, 8.1% and 17.6% increase in success rate (SR), success weighted by path length (SPL) and success weighted by action efficiency (SAE), respectively.

PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization arxiv:2204.04413 📈 3

Xiaochen Liu, Yu Bai, Jiawei Li, Yinan Hu, Yang Gao

**Abstract:** Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we designed a novel soft prompts architecture coupled with a prompt pre-training plus fine-tuning paradigm that is effective and tunes only extremely light parameters. The soft prompts include continuous input embeddings across an encoder and a decoder to fit the structure of the generation models. Importantly, a novel inner-prompt placed in the text is introduced to capture document-level information. The aim is to devote attention to understanding the document that better prompts the model to generate document-related content. The first step in the summarization procedure is to conduct prompt pre-training with self-supervised pseudo-data. This teaches the model basic summarizing capabilities. The model is then fine-tuned with few-shot examples. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.

Private Sequential Hypothesis Testing for Statisticians: Privacy, Error Rates, and Sample Size arxiv:2204.04597 📈 2

Wanrong Zhang, Yajun Mei, Rachel Cummings

**Abstract:** The sequential hypothesis testing problem is a class of statistical analyses where the sample size is not fixed in advance. Instead, the decision process takes in new observations sequentially to make real-time decisions for testing an alternative hypothesis against a null hypothesis until some stopping criterion is satisfied. In many common applications of sequential hypothesis testing, the data can be highly sensitive and may require privacy protection; for example, sequential hypothesis testing is used in clinical trials, where doctors sequentially collect data from patients and must determine when to stop recruiting patients and whether the treatment is effective. The field of differential privacy has been developed to offer data analysis tools with strong privacy guarantees, and has been commonly applied to machine learning and statistical tasks. In this work, we study the sequential hypothesis testing problem under a slight variant of differential privacy, known as Renyi differential privacy. We present a new private algorithm based on Wald's Sequential Probability Ratio Test (SPRT) that also gives strong theoretical privacy guarantees. We provide theoretical analysis on statistical performance measured by Type I and Type II error as well as the expected sample size. We also empirically validate our theoretical results on several synthetic databases, showing that our algorithms also perform well in practice. Unlike previous work in private hypothesis testing that focused only on the classical fixed sample setting, our results in the sequential setting allow a conclusion to be reached much earlier, thus saving the cost of collecting additional samples.
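
As a reference point, here is a sketch of a privatized Wald SPRT: the running log-likelihood ratio is perturbed with Gaussian noise before each threshold comparison. The noise placement and calibration here are assumptions for illustration; the paper's actual mechanism and RDP accounting may differ.

```python
import numpy as np

def private_sprt(samples, llr, a, b, sigma=1.0, rng=None):
    """Noisy SPRT sketch.

    llr(x): per-sample log-likelihood ratio log p1(x)/p0(x)
    a < 0 < b: accept/reject thresholds (Wald's approximations)."""
    rng = rng or np.random.default_rng()
    s, n = 0.0, 0
    for n, x in enumerate(samples, start=1):
        s += llr(x)
        noisy = s + rng.normal(scale=sigma)   # privacy noise on the statistic
        if noisy >= b:
            return "reject H0", n
        if noisy <= a:
            return "accept H0", n
    return "undecided", n

# usage: test N(1,1) data against H0: N(0,1); llr(x) = x - 0.5
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, size=1000)
print(private_sprt(data, lambda x: x - 0.5, a=-4.6, b=4.6, rng=rng))
```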

Knowledge-Free Black-Box Watermark and Ownership Proof for Image Classification Neural Networks arxiv:2204.04522 📈 2

Fangqi Li, Shilin Wang

**Abstract:** Watermarking has become a plausible candidate for ownership verification and intellectual property protection of deep neural networks. Regarding image classification neural networks, current watermarking schemes uniformly resort to backdoor triggers. However, injecting a backdoor into a neural network requires knowledge of the training dataset, which is usually unavailable in real-world commercialization. Meanwhile, established watermarking schemes overlook the potential damage from evidence exposed during ownership verification and from the watermarking algorithms themselves. These concerns keep current watermarking schemes out of industrial application. To confront these challenges, we propose a knowledge-free black-box watermarking scheme for image classification neural networks. An image generator obtained from a data-free distillation process is leveraged to stabilize the network's performance during the backdoor injection. A carefully designed encoding and verification protocol ensures the scheme's security against knowledgeable adversaries. We also give a pioneering analysis of the capacity of the watermarking scheme. Experimental results prove the functionality-preserving capability and security of the proposed watermarking scheme.

Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology arxiv:2204.04516 📈 2

James M Dolezal, Andrew Srisuwananukorn, Dmitry Karpeyev, Siddhi Ramesh, Sara Kochanny, Brittany Cody, Aaron Mansfield, Sagar Rakshit, Radhika Bansa, Melanie Bois, Aaron O Bungum, Jefree J Schulte, Everett E Vokes, Marina Chiara Garassino, Aliya N Husain, Alexander T Pearson

**Abstract:** A model's ability to express its own predictive uncertainty is an essential attribute for maintaining clinical user confidence as computational biomarkers are deployed into real-world medical settings. In the domain of cancer digital histopathology, we describe a novel, clinically-oriented approach to uncertainty quantification (UQ) for whole-slide images, estimating uncertainty using dropout and calculating thresholds on training data to establish cutoffs for low- and high-confidence predictions. We train models to identify lung adenocarcinoma vs. squamous cell carcinoma and show that high-confidence predictions outperform predictions without UQ, in both cross-validation and testing on two large external datasets spanning multiple institutions. Our testing strategy closely approximates real-world application, with predictions generated on unsupervised, unannotated slides using predetermined thresholds. Furthermore, we show that UQ thresholding remains reliable in the setting of domain shift, with accurate high-confidence predictions of adenocarcinoma vs. squamous cell carcinoma for out-of-distribution, non-lung cancer cohorts.
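
The "estimating uncertainty using dropout" step corresponds to Monte Carlo dropout: keep dropout stochastic at test time, average the softmax over several forward passes, and treat the spread as uncertainty. A minimal PyTorch sketch; the cutoff in the usage comment is illustrative, standing in for the thresholds the paper calibrates on training data.

```python
import torch

def enable_dropout(model):
    """Keep dropout stochastic at inference while the rest of the model
    (e.g. batch norm) stays in eval mode."""
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

@torch.no_grad()
def mc_dropout_predict(model, x, passes=30):
    """Mean prediction and per-class spread over stochastic forward passes."""
    model.eval()
    enable_dropout(model)
    probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(passes)])
    return probs.mean(0), probs.std(0)

# usage (cutoff value illustrative, not the paper's calibrated threshold):
# mean, std = mc_dropout_predict(net, tiles)
# high_confidence = std.gather(-1, mean.argmax(-1, keepdim=True)) < 0.05
```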

Explain yourself! Effects of Explanations in Human-Robot Interaction arxiv:2204.04501 📈 2

Jakob Ambsdorf, Alina Munir, Yiyao Wei, Klaas Degkwitz, Harm Matthias Harms, Susanne Stannek, Kyra Ahrens, Dennis Becker, Erik Strahl, Tom Weber, Stefan Wermter

**Abstract:** Recent developments in explainable artificial intelligence promise the potential to transform human-robot interaction: Explanations of robot decisions could affect user perceptions, justify their reliability, and increase trust. However, the effects on human perceptions of robots that explain their decisions have not been studied thoroughly. To analyze the effect of explainable robots, we conduct a study in which two simulated robots play a competitive board game. While one robot explains its moves, the other robot only announces them. Providing explanations for its actions was not sufficient to change the perceived competence, intelligence, likeability or safety ratings of the robot. However, the results show that the robot that explains its moves is perceived as more lively and human-like. This study demonstrates the need for and potential of explainable human-robot interaction and the wider assessment of its effects as a novel research direction.

Uninformative Input Features and Counterfactual Invariance: Two Perspectives on Spurious Correlations in Natural Language arxiv:2204.04487 📈 2

Jacob Eisenstein

**Abstract:** Spurious correlations are a threat to the trustworthiness of natural language processing systems, motivating research into methods for identifying and eliminating them. Gardner et al. (2021) argue that the compositional nature of language implies that *all* correlations between labels and individual input features are spurious. This paper analyzes this proposal in the context of a toy example, demonstrating three distinct conditions that can give rise to feature-label correlations in a simple PCFG. Linking the toy example to a structured causal model shows that (1) feature-label correlations can arise even when the label is invariant to interventions on the feature, and (2) feature-label correlations may be absent even when the label is sensitive to interventions on the feature. Because input features will be individually correlated with labels in all but very rare circumstances, domain knowledge must be applied to identify spurious correlations that pose genuine robustness threats.

FoundationLayerNorm: Scaling BERT and GPT to 1,000 Layers arxiv:2204.04477 📈 2

Dezhou Shen

**Abstract:** The mainstream BERT/GPT model contains only 10 to 20 layers, and there is little literature discussing the training of deep BERT/GPT models. This paper proposes a simple yet effective method to stabilize BERT and GPT training. We successfully scale up BERT and GPT to 1,000 layers, which is an order of magnitude deeper than previous BERT and GPT models. The proposed method, FoundationLayerNormalization, enables efficient training of deep neural networks and is validated at the 1000-layer scale.

High-dimensional Asymptotics of Langevin Dynamics in Spiked Matrix Models arxiv:2204.04476 📈 2

Tengyuan Liang, Subhabrata Sen, Pragya Sur

**Abstract:** We study Langevin dynamics for recovering the planted signal in the spiked matrix model. We provide a "path-wise" characterization of the overlap between the output of the Langevin algorithm and the planted signal. This overlap is characterized in terms of a self-consistent system of integro-differential equations, usually referred to as the Crisanti-Horner-Sommers-Cugliandolo-Kurchan (CHSCK) equations in the spin glass literature. As a second contribution, we derive an explicit formula for the limiting overlap in terms of the signal-to-noise ratio and the injected noise in the diffusion. This uncovers a sharp phase transition -- in one regime, the limiting overlap is strictly positive, while in the other, the injected noise overcomes the signal, and the limiting overlap is zero.
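
For readers who want to see the object of study concretely, here is a sketch of discretized Langevin dynamics on the spiked matrix model $Y = \frac{\lambda}{n} vv^T + W$, with the overlap $|\langle x, v\rangle|/n$ as the tracked quantity. The discretization, sphere normalization, and parameter values are illustrative choices, not the paper's analysis setup.

```python
import numpy as np

def langevin_spiked_matrix(Y, beta=2.0, dt=0.01, steps=5000, rng=None):
    """Langevin dynamics for the quadratic Hamiltonian 0.5 x^T Y x,
    with injected noise at inverse temperature beta, renormalized to
    the sphere of radius sqrt(n)."""
    rng = rng or np.random.default_rng()
    n = Y.shape[0]
    x = rng.normal(size=n)
    x *= np.sqrt(n) / np.linalg.norm(x)
    for _ in range(steps):
        grad = Y @ x                                # gradient of 0.5 x^T Y x
        x = x + dt * grad + np.sqrt(2 * dt / beta) * rng.normal(size=n)
        x *= np.sqrt(n) / np.linalg.norm(x)         # stay on the sphere
    return x

n, lam, rng = 400, 3.0, np.random.default_rng(0)
v = rng.choice([-1.0, 1.0], size=n)                 # planted signal
G = rng.normal(size=(n, n))
W = (G + G.T) / np.sqrt(2 * n)                      # GOE noise
Y = (lam / n) * np.outer(v, v) + W
x = langevin_spiked_matrix(Y, rng=rng)
print(abs(x @ v) / n)                               # overlap with the signal
```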

Ultrasound Signal Processing: From Models to Deep Learning arxiv:2204.04466 📈 2

Ben Luijten, Nishith Chennakeshava, Yonina C. Eldar, Massimo Mischi, Ruud J. G. van Sloun

**Abstract:** Medical ultrasound imaging relies heavily on high-quality signal processing algorithms to provide reliable and interpretable image reconstructions. Hand-crafted reconstruction methods, often based on approximations of the underlying measurement model, are useful in practice, but notoriously fall behind in terms of image quality. More sophisticated solutions, based on statistical modelling, careful parameter tuning, or increased model complexity, can be sensitive to different environments. Recently, deep learning based methods have gained popularity, which are optimized in a data-driven fashion. These model-agnostic methods often rely on generic model structures, and require vast training data to converge to a robust solution. A relatively new paradigm combines the power of the two: leveraging data-driven deep learning, as well as exploiting domain knowledge. These model-based solutions yield high robustness, and require fewer trainable parameters and less training data than conventional neural networks. In this work we provide an overview of these methods from the recent literature, and discuss a wide variety of ultrasound applications. We aim to inspire the reader to further research in this area, and to address the opportunities within the field of ultrasound signal processing. We conclude with a future perspective on these model-based deep learning techniques for medical ultrasound applications.

Yes, Topology Matters in Decentralized Optimization: Refined Convergence and Topology Learning under Heterogeneous Data arxiv:2204.04452 📈 2

B. Le Bars, A. Bellet, M. Tommasi, AM. Kermarrec

**Abstract:** One of the key challenges in federated and decentralized learning is to design algorithms that efficiently deal with highly heterogeneous data distributions across agents. In this paper, we revisit the analysis of Decentralized Stochastic Gradient Descent algorithm (D-SGD), a popular decentralized learning algorithm, under data heterogeneity. We exhibit the key role played by a new quantity, that we call neighborhood heterogeneity, on the convergence rate of D-SGD. Unlike prior work, neighborhood heterogeneity is measured at the level of the neighborhood of an agent in the graph topology. By coupling the topology and the heterogeneity of the agents' distributions, our analysis sheds light on the poorly understood interplay between these two concepts in decentralized learning. We then argue that neighborhood heterogeneity provides a natural criterion to learn sparse data-dependent topologies that reduce (and can even eliminate) the otherwise detrimental effect of data heterogeneity on the convergence time of D-SGD. For the important case of classification with label skew, we formulate the problem of learning such a good topology as a tractable optimization problem that we solve with a Frank-Wolfe algorithm. Our approach provides a principled way to design a sparse topology that balances the number of iterations and the per-iteration communication costs of D-SGD under data heterogeneity.
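
The algorithm under analysis, D-SGD, is compact enough to state in a few lines: each agent takes a local stochastic gradient step and then gossip-averages with its neighbors through a mixing matrix whose sparsity pattern is the communication topology. A minimal NumPy sketch, with a ring topology as an illustrative example:

```python
import numpy as np

def dsgd_step(params, grads, W, lr=0.1):
    """One D-SGD iteration.

    params, grads: (n_agents, dim) local parameters and stochastic gradients
    W: (n_agents, n_agents) doubly stochastic mixing (gossip) matrix."""
    return W @ (params - lr * grads)

# ring topology: each agent mixes with itself and its two neighbors
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, [i, (i - 1) % n, (i + 1) % n]] = 1 / 3
```

The paper's topology-learning step amounts to choosing the sparsity pattern of `W` (given the agents' label skew) rather than fixing it a priori as above.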

An Introductory Review of Spiking Neural Network and Artificial Neural Network: From Biological Intelligence to Artificial Intelligence arxiv:2204.07519 📈 1

Shengjie Zheng, Lang Qian, Pingsheng Li, Chenggang He, Xiaoqin Qin, Xiaojian Li

**Abstract:** Recently, stemming from the rapid development of artificial intelligence, which has gained expansive success in pattern recognition, robotics, and bioinformatics, neuroscience is also gaining tremendous progress. A kind of spiking neural network with biological interpretability is gradually receiving wide attention, and this kind of neural network is also regarded as one of the directions toward general artificial intelligence. This review introduces the following sections, the biological background of spiking neurons and the theoretical basis, different neuronal models, the connectivity of neural circuits, the mainstream neural network learning mechanisms and network architectures, etc. This review hopes to attract different researchers and advance the development of brain-inspired intelligence and artificial intelligence.

Real order total variation with applications to the loss functions in learning schemes arxiv:2204.04582 📈 1

Pan Liu, Xin Yang Lu, Kunlun He

**Abstract:** Loss functions are an essential part of modern data-driven approaches, such as bi-level training schemes and machine learning. In this paper we propose a loss function consisting of $r$-order (an)isotropic total variation semi-norms $TV^r$, $r\in \mathbb{R}^+$, defined via the Riemann-Liouville (R-L) fractional derivative. We focus on key theoretical properties of such loss functions, such as lower semi-continuity and compactness with respect to both the function and the order of derivative $r$.
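
For reference, the Riemann-Liouville fractional derivative underlying $TV^r$ is, for $r > 0$ and $n = \lceil r \rceil$,

$$ (D^{r}_{a+} f)(x) \;=\; \frac{1}{\Gamma(n-r)} \, \frac{d^{n}}{dx^{n}} \int_{a}^{x} \frac{f(t)}{(x-t)^{\,r-n+1}} \, dt, $$

so that a scalar semi-norm of the form $TV^r(f) = \int_a^b \lvert (D^{r}_{a+} f)(x) \rvert \, dx$ recovers the usual total variation at $r = 1$. This scalar form is only illustrative; the paper's (an)isotropic definition is more general.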

Efficient Reconstruction of Stochastic Pedigrees: Some Steps From Theory to Practice arxiv:2204.04573 📈 1

Elchanan Mossel, David Vulakh

**Abstract:** In an extant population, how much information do extant individuals provide on the pedigree of their ancestors? Recent work by Kim, Mossel, Ramnarayan and Turner (2020) studied this question under a number of simplifying assumptions, including random mating, fixed length inheritance blocks and a sufficiently large founding population. They showed that under these conditions, if the average number of offspring is a sufficiently large constant, then it is possible to recover a large fraction of the pedigree structure and genetic content by an algorithm they named REC-GEN. We are interested in studying the performance of REC-GEN on simulated data generated according to the model. As a first step, we improve the running time of the algorithm. However, we observe that even the faster version of the algorithm does not do well in any of our simulations at recovering the pedigree beyond 2 generations. We claim that this is due to the inbreeding present in any setting where the algorithm can be run, even on simulated data. To support this claim, we show that a main step of the algorithm, called ancestral reconstruction, performs accurately in an idealized setting with no inbreeding but performs poorly in random mating populations. To overcome the poor behavior of REC-GEN, we introduce a Belief-Propagation based heuristic that accounts for the inbreeding and performs much better in our simulations.

Spectral bounds of the $\varepsilon$-entropy of kernel classes arxiv:2204.04512 📈 0

Rustem Takhanov

**Abstract:** We develop new upper and lower bounds on the $\varepsilon$-entropy of a unit ball in a reproducing kernel Hilbert space induced by some Mercer kernel $K$. Our bounds are based on the behaviour of the eigenvalues of a corresponding integral operator. In our approach we exploit the ellipsoidal structure of a unit ball in RKHS and previous work by Dumer, Pinsker and Prelov on covering numbers of ellipsoids in Euclidean space. We present a number of applications of our main bound, such as its tightness for the practically important case of the Gaussian kernel. Further, we develop a series of lower bounds on the $\varepsilon$-entropy that can be established from a connection between covering numbers of a ball in RKHS and the quantization of a Gaussian Random Field that corresponds to the kernel $K$ via the Kosambi-Karhunen-Loève transform.
