Progress in automated handwriting recognition has been hampered by the lack of large training datasets; nearly all research relies on small datasets that often cause models to overfit. We present CENSUS-HWR, a new dataset of full English handwritten words comprising 1,812,014 grayscale images. In total, the collection contains 1,865,134 handwritten texts drawn from an English vocabulary of 10,711 words. The dataset is intended to serve as a benchmark for deep learning handwriting recognition models. It was extracted from the US 1930 and 1940 censuses, each taken by approximately 70,000 enumerators. The dataset and the trained model with its weights are freely available for download at this https URL.
The Transformer architecture is shown to provide a powerful machine transduction framework for online handwritten gestures corresponding to the glyph strokes of natural-language sentences. The attention mechanism is successfully used to create latent representations in an end-to-end encoder-decoder model, solving multi-level segmentation while also learning some language features and syntax rules. The additional use of a large decoding space with learned Byte-Pair Encoding (BPE) is shown to provide robustness to ablated inputs and syntax rules. The encoder stack is fed directly with spatio-temporal data tokens, potentially forming an infinitely large input vocabulary, an approach with applications beyond this work. Encoder transfer-learning capabilities are also demonstrated across several languages, resulting in faster optimisation and shared parameters. A new supervised dataset of online handwriting gestures suitable for generic handwriting recognition tasks was used to train a small transformer model to an average normalised Levenshtein accuracy of 96% on English and German sentences and 94% on French.
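The normalised Levenshtein accuracy reported above is a standard HTR metric; as a minimal illustration (assuming the common definition of one minus the edit distance divided by the reference length, which may differ in detail from the paper's exact normalisation), it can be computed as:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def normalised_levenshtein_accuracy(hyp: str, ref: str) -> float:
    """1 - edit_distance / len(ref); 1.0 means a perfect transcription."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return 1.0 - levenshtein(hyp, ref) / len(ref)
```

For example, a one-character substitution in a five-character reference yields an accuracy of 0.8.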
Recent advancements in deep learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on both modern and historical manuscripts in large benchmark datasets. Nonetheless, these models struggle to achieve the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting. This issue is particularly relevant for valuable but small collections of documents preserved in historical archives, for which obtaining sufficient annotated training data is costly or, in some cases, unfeasible. To overcome this challenge, a possible solution is to pretrain HTR models on large datasets and then fine-tune them on small single-author collections. In this paper, we consider both large real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model. Through extensive experimental analysis, also considering the number of fine-tuning lines, we give a quantitative indication of the most relevant characteristics of such data for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as few as five real fine-tuning lines.
We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the different steps that were so far performed by separate models: feature extraction, handwriting recognition, and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at different levels: line, paragraph, and page. Our experiments show that attention-based models are especially interesting when applied to full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and surpass previous results on all three datasets.
Generative modelling of continuous-time geometric constructs such as handwriting, sketches, and drawings has so far been accomplished through autoregressive distributions. Such strictly ordered discrete factorization, however, falls short of capturing key properties of chirographic data: one-way visibility (causality) prevents it from building a holistic understanding of the temporal concept. Consequently, temporal data has been modelled as discrete token sequences at a fixed sampling rate instead of capturing the true underlying concept. In this paper, we introduce a powerful model class, Denoising Diffusion Probabilistic Models (DDPMs), for chirographic data that specifically addresses these flaws. Our model, ChiroDiff, being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates to a good extent. Moreover, we show that many important downstream utilities (e.g., conditional sampling, creative mixing) can be implemented flexibly using ChiroDiff. We further show that unique use-cases such as stochastic vectorization, de-noising/healing, and abstraction are also possible with this model class. We evaluate our framework quantitatively and qualitatively on relevant datasets and find it to be better than or on par with competing approaches.
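As an illustration of the non-autoregressive modelling described above, the closed-form forward (noising) step of a DDPM can be applied to an entire pen trajectory at once, with no left-to-right factorisation. This is a generic DDPM sketch with an illustrative linear beta schedule, not ChiroDiff's actual configuration:

```python
import numpy as np

def make_alpha_bar(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> np.ndarray:
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0: np.ndarray, t: int, alpha_bar: np.ndarray,
             rng: np.random.Generator) -> np.ndarray:
    """Forward-diffuse a whole (N, 2) pen trajectory in one closed-form step:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
```

By the final timestep the signal coefficient is nearly zero, so x_T is approximately pure Gaussian noise, which the learned reverse process then denoises back into a trajectory.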
Planning from demonstrations has shown promising results with the advances of deep neural networks. One of the most popular real-world applications is automated handwriting using a robotic manipulator. Classically, it is simplified as a two-dimensional problem. This representation is suitable for elementary drawings but not sufficient for Japanese calligraphy or complex works of art, where the orientation of the pen is part of the user's expression. In this study, we focus on automated planning of Japanese calligraphy using a three-dimensional representation of the trajectory as well as the rotation of the pen tip, and propose a novel deep imitation-learning neural network that learns from expert demonstrations through a combination of images and pose data. The network consists of a combination of a variational auto-encoder, a bi-directional LSTM, and a Multi-Layer Perceptron (MLP). Experiments are conducted in a progressive way, and results demonstrate that the proposed approach successfully completes tasks on real-world robots, overcoming the distribution-shift problem in imitation learning. The source code and dataset will be made public.
In this work, we explore massive pre-training on synthetic word images for enhancing the performance on four benchmark downstream handwriting analysis tasks. To this end, we build a large synthetic dataset of word images rendered in several handwriting fonts, which offers a complete supervision signal. We use it to train a simple convolutional neural network (ConvNet) with a fully supervised objective. The vector representations of the images obtained from the pre-trained ConvNet can then be considered encodings of the handwriting style. We exploit such representations for Writer Retrieval, Writer Identification, Writer Verification, and Writer Classification, and demonstrate that our pre-training strategy allows extracting rich representations of writers' styles, enabling the aforementioned tasks with results competitive with task-specific state-of-the-art approaches.
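As a sketch of how such pre-trained style encodings can drive Writer Retrieval, a simple cosine-similarity ranking over embedding vectors might look as follows (the embeddings here are placeholders for the ConvNet's outputs, not the paper's actual pipeline):

```python
import numpy as np

def cosine_retrieve(query: np.ndarray, gallery: np.ndarray) -> np.ndarray:
    """Rank gallery rows (one style embedding per document) by cosine
    similarity to the query embedding; best match first."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims)
```

Writer Identification then amounts to reading off the writer label of the top-ranked gallery entry.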
Training machines to synthesize diverse handwritings is an intriguing task. Recently, RNN-based methods have been proposed to generate stylized online Chinese characters. However, these methods mainly focus on capturing a person's overall writing style, neglecting subtle style inconsistencies between characters written by the same person. For example, while a person's handwriting typically exhibits general uniformity (e.g., glyph slant and aspect ratios), there are still small style variations in finer details (e.g., stroke length and curvature) of characters. In light of this, we propose to disentangle the style representations at both writer and character levels from individual handwritings to synthesize realistic stylized online handwritten characters. Specifically, we present the style-disentangled Transformer (SDT), which employs two complementary contrastive objectives to extract the style commonalities of reference samples and capture the detailed style patterns of each sample, respectively. Extensive experiments on various language scripts demonstrate the effectiveness of SDT. Notably, our empirical findings reveal that the two learned style representations provide information at different frequency magnitudes, underscoring the importance of separate style extraction. Our source code is public at: this https URL.
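The complementary contrastive objectives in SDT are described only at a high level above; a generic InfoNCE-style loss over style embeddings, shown here as an illustrative stand-in rather than SDT's exact formulation, captures the pull-together/push-apart idea:

```python
import numpy as np

def info_nce(anchor: np.ndarray, positive: np.ndarray,
             negatives: np.ndarray, tau: float = 0.1) -> float:
    """InfoNCE: pull the anchor toward its positive, push it from negatives.
    anchor/positive are embedding vectors; negatives holds one vector per row."""
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    a, p, negs = unit(anchor), unit(positive), unit(negatives)
    logits = np.concatenate([[a @ p], negs @ a]) / tau
    logits -= logits.max()  # numerical stability before softmax
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))
```

The loss is small when the anchor aligns with its positive and is far from the negatives, and large otherwise.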
The Transformer has quickly become the dominant architecture for various pattern recognition tasks due to its capacity for long-range representation. However, transformers are data-hungry models that need large datasets for training. In Handwritten Text Recognition (HTR), collecting a massive amount of labeled data is a complicated and expensive task. In this paper, we propose a lite transformer architecture for full-page multi-script handwriting recognition. The proposed model comes with three advantages. First, to address the common problem of data scarcity, the lite transformer can be trained on a reasonable amount of data, as is the case for most public HTR datasets, without the need for external data. Second, it can learn the reading order at page level thanks to a curriculum learning strategy, allowing it to avoid line segmentation errors, exploit a larger context, and reduce the need for costly segmentation annotations. Third, it can be easily adapted to other scripts by applying a simple transfer-learning process using only page-level labeled images. Extensive experiments on datasets in different scripts (French, English, Spanish, and Arabic) show the effectiveness of the proposed model.
This study investigates the effect of haptic control strategies on a subject's mental engagement during a fine-motor handwriting rehabilitation task. The considered control strategies include error reduction (ER) and error augmentation (EA), which are tested on both the dominant and non-dominant hands. A non-invasive brain-computer interface is used to monitor the electroencephalogram (EEG) activities of the subjects and to evaluate each subject's mental engagement using the power of multiple frequency bands (theta, alpha, and beta). Statistical analysis revealed that the choice of haptic control strategy has a significant effect (p < 0.001) on mental engagement depending on the hand used (dominant or non-dominant). Among the evaluated strategies, EA is shown to be more mentally engaging than ER for the non-dominant hand.
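The band-power computation underlying the engagement measure can be sketched with a plain periodogram; the band limits below are conventional EEG values, not necessarily the exact ones used in the study:

```python
import numpy as np

def band_power(signal: np.ndarray, fs: float, lo: float, hi: float) -> float:
    """Mean power spectral density in [lo, hi) Hz via a plain periodogram."""
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= lo) & (freqs < hi)
    return float(psd[mask].mean())

# Conventional EEG band limits in Hz (illustrative defaults).
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
```

A pure 10 Hz oscillation, for instance, shows up almost entirely in the alpha band.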
One of the factors limiting the performance of handwritten text recognition (HTR) for stenography is the small amount of annotated training data. To alleviate the problem of data scarcity, modern HTR methods often employ data augmentation. However, due to the specifics of the stenographic script, common augmentation settings may not be directly applicable to stenography recognition. In this work, we study 22 classical augmentation techniques, most of which are commonly used for HTR of other scripts, such as Latin handwriting. Through extensive experiments, we identify a group of augmentations, for example constrained ranges of random rotation, shift, and scaling, that are beneficial to stenography recognition. Furthermore, we identify a number of augmentation approaches that lead to a decrease in recognition performance. Our results are supported by statistical hypothesis testing. Links to the publicly available dataset and codebase are provided.
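A minimal sketch of sampling such constrained augmentation parameters as a single affine transform follows; the bounds shown are illustrative defaults, not the values identified in the experiments:

```python
import numpy as np

def sample_affine(rng: np.random.Generator,
                  max_deg: float = 2.0, max_shift: float = 3.0,
                  scale_lo: float = 0.95, scale_hi: float = 1.05) -> np.ndarray:
    """Sample a 2x3 affine matrix combining a small rotation, shift (pixels),
    and isotropic scale, each drawn from a deliberately narrow range."""
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    s = rng.uniform(scale_lo, scale_hi)
    tx, ty = rng.uniform(-max_shift, max_shift, size=2)
    c, si = np.cos(theta), np.sin(theta)
    return np.array([[s * c, -s * si, tx],
                     [s * si,  s * c, ty]])
```

The matrix can then be applied to a line image with any standard warp routine; keeping the ranges narrow is the point, since aggressive warps distort the stenographic strokes.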
Deformable linear objects (DLOs), such as rods, cables, and ropes, play important roles in daily life. However, manipulation of DLOs is challenging, as large geometrically nonlinear deformations may occur during manipulation. The problem is made even more difficult because the different deformation modes (e.g., stretching, bending, and twisting) may result in elastic instabilities during manipulation. In this paper, we formulate a physics-guided data-driven method to solve a challenging manipulation task: accurately deploying a DLO (an elastic rod) onto a rigid substrate along various prescribed patterns. Our framework combines machine learning, scaling analysis, and physics-based simulations to develop a physically informed neural controller for deployment. We explore the complex interplay between the gravitational and elastic energies of the manipulated DLO and obtain a control method for DLO deployment that is robust against friction and material properties. Out of the numerous geometrical and material properties of the rod and substrate, we show through physical analysis that only three non-dimensional parameters are needed to describe the deployment process. Therefore, the essence of the control law for the manipulation task can be constructed from a low-dimensional model, drastically increasing computation speed. The effectiveness of our optimal control scheme is shown through a comprehensive robotic case study comparing against a heuristic control method for deploying rods in a wide variety of patterns. In addition, we showcase the practicality of our control scheme by having a robot accomplish challenging high-level tasks such as mimicking human handwriting and tying knots.
In this paper, we present AR3n (pronounced as Aaron), an assist-as-needed (AAN) controller that utilizes reinforcement learning to supply adaptive assistance during a robot-assisted handwriting rehabilitation task. Unlike previous AAN controllers, our method does not rely on patient-specific controller parameters or physical models. We propose the use of a virtual patient model to generalize AR3n across multiple subjects. The system modulates robotic assistance in real time based on a subject's tracking error, while minimizing the amount of robotic assistance. The controller is experimentally validated through a set of simulations and human subject experiments. Finally, a comparative study with a traditional rule-based controller is conducted to analyze differences in the assistance mechanisms of the two controllers.
One of the challenges of handwriting recognition is to transcribe a large number of vastly different writing styles. State-of-the-art approaches do not explicitly use information about the writer's style, which may limit overall accuracy due to various ambiguities. We explore models with writer-dependent parameters that take the writer's identity as an additional input. The proposed models can be trained on datasets whose partitions are likely written by a single author (e.g., a single letter, diary, or chronicle). We propose a Writer Style Block (WSB), an adaptive instance normalization layer conditioned on learned embeddings of the partitions. We experimented with various placements and settings of the WSB and with contrastively pre-trained embeddings. We show that our approach outperforms a baseline without the WSB in a writer-dependent scenario and that it is possible to estimate embeddings for new writers. However, domain adaptation using simple finetuning in a writer-independent setting provides superior accuracy at a similar computational cost. The proposed approach should be further investigated in terms of training stability and embedding regularization to overcome this baseline.
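A Writer Style Block along the lines described above, i.e., adaptive instance normalisation whose scale and shift are predicted from a writer embedding by a linear layer, can be sketched as follows (a simplified numpy stand-in for the paper's architecture; W and b are the hypothetical linear layer's learned parameters):

```python
import numpy as np

def writer_style_block(x: np.ndarray, writer_emb: np.ndarray,
                       W: np.ndarray, b: np.ndarray,
                       eps: float = 1e-5) -> np.ndarray:
    """Adaptive instance normalisation conditioned on a writer embedding:
    normalise each channel of x (shape (C, T)) over the sequence axis, then
    scale/shift with (gamma, beta) predicted from the embedding by a linear
    layer whose weights are W (shape (2C, D)) and b (shape (2C,))."""
    C = x.shape[0]
    gamma_beta = W @ writer_emb + b
    gamma, beta = gamma_beta[:C], gamma_beta[C:]
    mu = x.mean(axis=1, keepdims=True)
    sigma = x.std(axis=1, keepdims=True)
    return gamma[:, None] * (x - mu) / (sigma + eps) + beta[:, None]
```

With gamma = 1 and beta = 0 the block reduces to plain instance normalisation; non-trivial embeddings let each writer reshape the feature statistics.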
In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple finetuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of finetuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, finetuning provided an average relative CER improvement of 25% with 16 text lines for new writers and 50% for 256 text lines.
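The relative CER improvement figures quoted above follow the usual definition, e.g.:

```python
def relative_cer_improvement(cer_base: float, cer_finetuned: float) -> float:
    """Relative character error rate (CER) improvement: a drop from 10% CER
    to 7.5% CER is a 25% relative improvement."""
    return (cer_base - cer_finetuned) / cer_base
```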
Text-writing robots have been used in assistive writing and drawing applications. However, robots do not convey emotional tones in the writing process due to the lack of behaviors humans typically adopt. To examine how people interpret designed robotic expressions of emotion through both movements and textual output, we used a pen-plotting robot to generate texts by performing human-like behaviors like stop-and-go, speed, and pressure variation. We examined how people convey emotion in the writing process by observing how they wrote in different emotional contexts. We then mapped these human expressions during writing to the handwriting robot and measured how well other participants understood the robot's affective expression. We found that textual output was the strongest determinant of participants' ability to perceive the robot's emotions, whereas parameters of gestural movements of the robots like speed, fluency, pressure, size, and acceleration could be useful for understanding the context of the writing expression.
The events of the past two years related to the pandemic have shown that it is increasingly important to find new tools to help mental health experts diagnose mood disorders. Leaving aside the long-COVID cognitive (e.g., difficulty concentrating) and bodily (e.g., loss of smell) effects, the short-term effects of COVID-19 on mental health were a significant increase in anxiety and depressive symptoms. The aim of this study is to use a new tool, online handwriting and drawing analysis, to discriminate between healthy individuals and depressed patients. To this aim, patients with clinical depression (n = 14), individuals with high sub-clinical (diagnosed by a test rather than by a doctor) depressive traits (n = 15), and healthy individuals (n = 20) were recruited and asked to perform four online drawing/handwriting tasks using a digitizing tablet and a special writing device. From the raw collected online data, seventeen drawing/writing features (grouped into five categories) were extracted and compared among the three groups of participants through repeated-measures ANOVA analyses. Results show that Time features are more effective in discriminating between healthy participants and those with sub-clinical depressive traits. On the other hand, Ductus and Pressure features are more effective in discriminating between clinically depressed and healthy participants.
We present a generative document-specific approach to character analysis and recognition in text lines. Our main idea is to build on unsupervised multi-object segmentation methods, in particular those that reconstruct images from a limited number of visual elements, called sprites. Our approach can learn a large number of different characters and leverage line-level annotations when available. Our contribution is twofold. First, we provide the first adaptation and evaluation of a deep unsupervised multi-object segmentation approach for text line analysis. Since these methods have mainly been evaluated on synthetic data in a completely unsupervised setting, demonstrating that they can be adapted to and quantitatively evaluated on real text images, and that they can be trained using weak supervision, constitutes significant progress. Second, we demonstrate the potential of our method for new applications, more specifically in the field of paleography, which studies the history and variations of handwriting, and in cipher analysis. We evaluate our approach on three very different datasets: a printed volume from the Google1000 dataset, the Copiale cipher, and historical handwritten charters from the 12th and early 13th centuries.
This paper presents the design and implementation of WhisperWand, a comprehensive voice and motion tracking interface for voice assistants. Distinct from prior works, WhisperWand is a precise tracking interface that can co-exist with the voice interface on low-sampling-rate voice assistants. Taking handwriting as a specific application, it can also capture natural strokes and the individualized style of writing while occupying only a single frequency. The core technique is an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, which enables voice assistants to use ultrasound as a ranging signal while using their regular microphone system as a receiver. We also design a new optimization algorithm that requires only a single frequency for time difference of arrival. The WhisperWand prototype achieves a median error of 73 µm in 1D ranging and 1.4 mm in 3D tracking of an acoustic beacon using the microphone array found in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, similar to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication increases only from 6.26% to 8.28%.
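The basic acoustic-ranging arithmetic behind these error figures is plain time-of-flight conversion; how CFCW actually estimates the time of flight from cross-frequency phase is beyond this sketch:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def range_from_tof(delta_t: float) -> float:
    """One-way distance from a measured acoustic time of flight (seconds)."""
    return SPEED_OF_SOUND * delta_t

def timing_for_range_error(range_error: float) -> float:
    """Timing precision (seconds) needed to achieve a given ranging error."""
    return range_error / SPEED_OF_SOUND
```

A 73 µm ranging error thus corresponds to sub-microsecond timing precision, which illustrates why the ranging estimator must be far finer than the audio sampling period.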
To date, studies focusing on the prodromal diagnosis of Lewy body diseases (LBDs) based on quantitative analysis of graphomotor and handwriting difficulties are missing. In this work, we enrolled 18 subjects diagnosed with possible or probable mild cognitive impairment with Lewy bodies (MCI-LB), 7 subjects with more than a 50% probability of developing Parkinson's disease (PD), 21 subjects with both possible/probable MCI-LB and a probability of PD > 50%, and 37 age- and gender-matched healthy controls (HC). Each participant performed three tasks: Archimedean spiral drawing (to quantify graphomotor difficulties), a sentence writing task (to quantify handwriting difficulties), and the pentagon copying test (to quantify cognitive decline). Next, we parameterized the acquired data with various temporal, kinematic, dynamic, spatial, and task-specific features. Finally, we trained classification models for each task separately, as well as a model for their combination, to estimate the predictive power of the features for identifying LBDs. Using this approach, we were able to identify prodromal LBDs with 74% accuracy, demonstrating the promising potential of computerized, objective, and non-invasive diagnosis of LBDs based on the assessment of graphomotor and handwriting difficulties.
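A minimal sketch of extracting kinematic features from a sampled pen trajectory, one illustrative subset of the temporal/kinematic/dynamic features mentioned above (the exact feature set used in the study is richer):

```python
import numpy as np

def kinematic_features(xy: np.ndarray, fs: float) -> dict:
    """Basic kinematic descriptors of an (N, 2) pen trajectory sampled at
    fs Hz: mean magnitudes of velocity, acceleration, and jerk, computed
    with central finite differences."""
    dt = 1.0 / fs
    vel = np.gradient(xy, dt, axis=0)
    acc = np.gradient(vel, dt, axis=0)
    jerk = np.gradient(acc, dt, axis=0)
    return {
        "mean_speed": float(np.linalg.norm(vel, axis=1).mean()),
        "mean_acc": float(np.linalg.norm(acc, axis=1).mean()),
        "mean_jerk": float(np.linalg.norm(jerk, axis=1).mean()),
    }
```

A constant-velocity stroke, for instance, yields zero mean acceleration and jerk, whereas tremulous handwriting inflates the jerk statistic.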