Customer-obsessed science
- July 22, 2024: “Best-fit packing” adapts bin packing to avoid unnecessary truncation of training documents, improving LLM performance across a wide range of tasks and reducing hallucination (see the sketch after this list).
- July 19, 2024: In his keynote address at CVPR, Swami Sivasubramanian considers the many ways that Amazon incorporates computer vision technology into its products and makes it directly available to Amazon Web Services’ customers.
- July 17, 2024: Learning algorithms and reinforcement learning are areas of focus, while LLM-related research, on topics such as continual learning, hallucination mitigation, and privacy, remains well represented.
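As a rough illustration of the idea behind best-fit packing, the sketch below packs documents (represented only by their token counts) into fixed-length training sequences with a best-fit-decreasing heuristic, so a document that already fits within the sequence length is never split across sequences. This is a minimal sketch of the general heuristic, not the paper's released implementation; the names best_fit_pack, doc_lengths, and max_len are illustrative, and documents longer than max_len are assumed to have been split into chunks of at most max_len beforehand.

```python
# Minimal, illustrative sketch of best-fit-decreasing packing of documents
# (given as token counts) into fixed-length training sequences.
# Assumption: every entry in doc_lengths is already <= max_len.

def best_fit_pack(doc_lengths, max_len):
    """Assign each document to the sequence whose remaining capacity is the
    smallest that still fits it (best fit); open a new sequence otherwise."""
    # Best-fit *decreasing*: consider the longest documents first.
    order = sorted(range(len(doc_lengths)), key=lambda i: -doc_lengths[i])
    bins = []  # each bin: [remaining_space, list_of_doc_indices]
    for i in order:
        need = doc_lengths[i]
        best = None
        for b, (space, _) in enumerate(bins):
            if space >= need and (best is None or space < bins[best][0]):
                best = b
        if best is None:
            bins.append([max_len - need, [i]])  # open a new training sequence
        else:
            bins[best][0] -= need
            bins[best][1].append(i)
    return [docs for _, docs in bins]

# Example: documents of 900, 600, 500, and 120 tokens, 1,024-token sequences.
print(best_fit_pack([900, 600, 500, 120], 1024))
# [[0, 3], [1], [2]] -- each document stays whole within a single sequence.
```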
- July 21 - 27, 2024
- August 11 - 16, 2024
- August 25 - 29, 2024
- July 08, 2024: University students will compete for cash prizes in a competition to advance the security of LLMs that code. Applications are due by September 1, 2024.
- ECCV 2024: In this work, we propose an efficient Video-Language Alignment (ViLA) network. Our ViLA model addresses both efficient frame sampling and effective cross-modal alignment in a unified way. In our ViLA network, we design a new learnable text-guided Frame-Prompter together with a cross-modal distillation (QFormer-Distiller) module. Pretrained large image-language models have shown promising results on problems…
- Sixth Symposium on Advances in Approximate Bayesian Inference, 2024: With advances in computational power, there has been rapid development of complex systems to predict certain outputs for industrial problems. Attributing outputs to input features, or output changes to input or system changes, has been a critical and challenging problem in many real-world applications. In industrial settings, a system could be a chain of large-scale models or simulators, or a combination…
- 2024: Cross-language transfer learning from English to a target language has shown effectiveness in low-resourced audiovisual speech recognition (AV-ASR). We first investigate a two-stage protocol, which performs fine-tuning of the English pre-trained AV encoder on a large audio corpus in the target language (first stage) and then carries out cross-modality transfer learning from audio to AV in the target language…
- ECCV 2024: Recent advancements in Multimodal Large Language Models (MLLMs) have revolutionized the field of vision-language understanding by integrating visual perception capabilities into Large Language Models (LLMs). The prevailing trend in this field involves the utilization of a vision encoder derived from vision-language contrastive learning (CL), showing expertise in capturing overall representations while facing…
- 2024: Convolution-augmented Transformer architectures have dominated the field of automatic speech recognition by showing better WER results when the models are trained on relatively smaller training data. In this work, we revisit the necessity of convolution modules in the ASR encoder architecture, given that the inductive bias brought by the convolution modules may only boost performance in a low training data…
July 24, 2024: The program empowers exceptional scholars from backgrounds historically underrepresented in STEM to become industry leaders through scholarship, research, and career opportunities.
Resources
- We look for talent from around the world: applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.
- We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.
- Learn more about the awards and recognitions that Amazon researchers from around the world have been honored with during their tenure.