transformer search results




transformer - 20 / 63
pub.towardsai.net | Yesterday
Summary:
Self-Attention in Transformers: Computation Logic and Implementation. Self-attention untangles the relationships between tokens in deep learning. Attention serves as a fundamental concept for the transformer architecture and for Large Language Models, playing p...
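
A minimal sketch of the scaled dot-product self-attention computation the article covers; the matrix names, shapes, and toy dimensions below are illustrative assumptions, not taken from the article.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over token embeddings.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_head) projection matrices
    """
    q = x @ w_q                                      # queries
    k = x @ w_k                                      # keys
    v = x @ w_v                                      # values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                               # each token = weighted mix of values

# Toy usage: 4 tokens, model and head dimension 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```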


Keywords: nlp, turing, computing, generative, ios

www.reddit.com | Yesterday
Summary:
From the abstract: We show that transformers can use meaningless filler tokens (e.g., '......') in place of chain of thought to solve two hard algorithmic tasks they could not solve when responding without intermediate tokens. However, we find...


Keywords: transformer

quantumzeitgeist.com | Yesterday
Summary:
Hybrid Quantum Vision Transformers, a new model based on vision transformer architectures, could potentially reduce training and operating time while maintaining predictive power. Despite their computational expense, transformer architectures have been...


Keywords: transformer, machine learning, quantum comp

arxiv.org | Yesterday
Summary:
Research on Large Language Models (LLMs) has recently seen exponential growth, largely focused on transformer-based architectures, as introduced by [1] and further advanced by the decoder-only variations in [2]. Contemporary studies typically aim to improve model capabilities by increasing both the architecture's complexity and the volume of training data. However, research exploring how to reduce model sizes while maintaining performance is limited. This study introduces three modifications to ...


Keywords: transformer

arxiv.org | Yesterday
Summary:
Large-scale geolocation telematics data acquired from connected vehicles has the potential to significantly enhance mobility infrastructures and operational systems within smart cities. To effectively utilize this data, it is essential to accurately match the geolocation data to the road segments. However, this matching is often not trivial due to the low sampling rate and errors exacerbated by multipath effects in urban environments. Traditionally, statistical modeling techniques such as Hidden...


Keywords: transformer, network, statistic, sampling

www.mltut.com | Today
Summary:
Do you want to know what Retrieval Augmented Generation (RAG) in AI is? If yes, this blog is for you. In this blog, I tried to explain what Retrieval Augmented Generation (RAG) in AI is in the simplest way. The post What is Retrieval Augmented Ge...
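
A rough sketch of the retrieve-then-generate loop that RAG refers to; the embed and generate functions below are stand-ins for any embedding model and LLM call, not a specific library API.

```python
import numpy as np

# Stand-ins for an embedding model and an LLM call; any real components could
# be swapped in. They only exist to illustrate the retrieve-then-generate flow.
def embed(text: str) -> np.ndarray:
    return np.random.default_rng(abs(hash(text)) % 2**32).normal(size=16)

def generate(prompt: str) -> str:
    return f"[LLM answer conditioned on a prompt of {len(prompt)} chars]"

documents = [
    "Transformers use self-attention to relate tokens in a sequence.",
    "RAG retrieves relevant documents and adds them to the LLM prompt.",
    "Feature pyramids are common in CNN-based object detectors.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def rag_answer(question: str, k: int = 2) -> str:
    q_vec = embed(question)
    # Cosine similarity between the question and each stored document.
    sims = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec) + 1e-9
    )
    # Retrieve the top-k documents and prepend them as context for the LLM.
    context = "\n".join(documents[i] for i in np.argsort(sims)[::-1][:k])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

print(rag_answer("What is retrieval augmented generation?"))
```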


Keywords: gpt, artificial intelligence, natural language

www.marktechpost.com | Yesterday
Summary:
The 2024 Zhongguancun Forum in Beijing saw the introduction of Vidu, an advanced AI model that can generate 16-second 1080p video clips from a simple prompt. Developed by ShengShu AI and Tsinghua University, Vidu is set to compete with OpenAI's Sora, ...


Keywords: transformer, tpu, generative, design, visual

arxiv.org | Yesterday
Summary:
Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection. However, existing models generally focus on the encoder-side Transformer to extract features, so a well-designed decoder can still bring further gains. We propose CFPFormer, a novel decoder block that integrates feature pyramids and transformers. Specifically, by leveraging patch embeddin...


Keywords: object detection, design, transformer, network,

stackoverflow.com | Today
Summary:
I am doing the Kaggle House Prices competition to practice my machine learning skills. I preprocess the data and then use cross-validation to test a couple of different models to see which one performs best. Unfortunately, although I receive normal-looki...
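
The workflow described here (preprocess, then cross-validate a few regressors) looks roughly like the scikit-learn sketch below; the synthetic data and choice of models are illustrative, not the actual House Prices setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Illustrative stand-in for the preprocessed features/target of the competition.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=200)

# Compare a couple of models with 5-fold cross-validation on RMSE.
for model in (Ridge(alpha=1.0),
              RandomForestRegressor(n_estimators=100, random_state=0)):
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(type(model).__name__, "mean CV RMSE:", -scores.mean())
```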


Keywords: python, regression, test, machine learning

arxiv.org | Yesterday
Summary:
The design of Graph Transformers (GTs) generally neglects considerations for fairness, resulting in biased outcomes against certain sensitive subgroups. Since GTs encode graph information without relying on message-passing mechanisms, conventional fairness-aware graph learning methods cannot be directly applied to address these issues. To tackle this challenge, we propose FairGT, a Fairness-aware Graph Transformer explicitly crafted to mitigate fairness concerns inherent in GTs. FairGT incorp...


Keywords: design, transformer

arxiv.org | Yesterday
Summary:
Let's Think Dot by Dot: Hidden Computation in Transformer Language Models...


Keywords: transformer

www.reddit.com | Today
Summary:
I have a fluid flow problem where the temperature, speed, and direction of the flow are to be modelled from many instances of experiments gathered over the past 30 years of research. Although the flow follows the equations of fluid mechanics, the rea...


Keywords: transformer, nlp, pre-trained, deep learning

arxiv.org | Yesterday
Summary:
Selective attention helps us focus on task-relevant aspects in the constant flood of our sensory input. This constraint in our perception allows us to robustly generalize under distractions and to new compositions of perceivable concepts. Transformers employ a similar notion of attention in their architecture, but representation learning models with transformer backbones like CLIP and DINO often fail to demonstrate robustness and compositionality. We highlight a missing architectural prior: unli...


Keywords: transformer, coding

github.com | Today
Summary:
Block and AgentFormer: playing with blocks and Transformers, yay...


Keywords: transformer

arxiv.org | Yesterday
Summary:
Proto-form reconstruction has been a painstaking process for linguists. Recently, computational models such as RNN and Transformers have been proposed to automate this process. We take three different approaches to improve upon previous methods, including data augmentation to recover missing reflexes, adding a VAE structure to the Transformer model for proto-to-language prediction, and using a neural machine translation model for the reconstruction task. We find that with the additional VAE stru...


Keywords: transformer

arxiv.org | Yesterday
Summary:
The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). This innovative approach organizes input data hierarchically into segments, each representing distinct abstraction levels, thereby enhancing processing efficiency for lengthy sequences. At each level, a dedicated tran...


Keywords: transformer, scala, classification

arxiv.org | Yesterday
Summary:
Transformer models have been used extensively, with good results, in a wide range of machine learning applications including Large Language Models and image generation. Here, we inquire into the applicability of this approach to financial time series. We first describe the dataset construction for two prototypical situations: a mean-reverting synthetic Ornstein-Uhlenbeck process on one hand and real S&P500 data on the other hand. Then, we present in detail the proposed Transformer architecture an...
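
To make the synthetic half of that dataset concrete, here is a standard Euler-Maruyama simulation of a mean-reverting Ornstein-Uhlenbeck process, sliced into fixed-length windows; the parameter values and window length are arbitrary choices, not the paper's.

```python
import numpy as np

def simulate_ou(n_steps=1000, x0=0.0, theta=1.0, mu=0.0, sigma=0.2, dt=0.01, seed=0):
    """Euler-Maruyama discretisation of dX_t = theta * (mu - X_t) dt + sigma dW_t."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        dw = rng.normal(scale=np.sqrt(dt))           # Brownian increment
        x[t] = x[t - 1] + theta * (mu - x[t - 1]) * dt + sigma * dw
    return x

series = simulate_ou()
# Slice the path into fixed-length windows, e.g. as input sequences for a model.
windows = np.lib.stride_tricks.sliding_window_view(series, 64)
print(series.shape, windows.shape)  # (1000,) (937, 64)
```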


Keywords: machine learning, time series, transformer

arxiv.org | Yesterday
Summary:
This study presents an innovative approach to portfolio optimization by integrating Transformer models with Generative Adversarial Networks (GANs) within the Black-Litterman (BL) framework. Capitalizing on Transformers' ability to discern long-range dependencies and GANs' proficiency in generating accurate predictive models, our method enhances the generation of refined predictive views for BL portfolio allocations. This fusion of our model with BL's structured method for merging objective views...
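
For reference, the Black-Litterman "structured method for merging objective views" mentioned above is the standard posterior-mean formula; the numpy sketch below uses made-up numbers for a two-asset case, not the paper's Transformer/GAN-generated views.

```python
import numpy as np

# Standard Black-Litterman posterior mean with toy inputs: 2 assets, 1 view.
tau = 0.05
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])       # prior covariance of asset returns
Pi = np.array([0.05, 0.07])            # equilibrium (prior) expected returns
P = np.array([[1.0, -1.0]])            # view: asset 1 outperforms asset 2 ...
Q = np.array([0.02])                   # ... by 2% (here a model-generated view)
Omega = np.array([[0.0004]])           # uncertainty attached to that view

inv = np.linalg.inv
A = inv(tau * Sigma)
posterior_mean = inv(A + P.T @ inv(Omega) @ P) @ (A @ Pi + P.T @ inv(Omega) @ Q)
print(posterior_mean)                  # blended expected returns for the allocation
```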


Keywords: framework, transformer, network, generative, optimization

arxiv.org | Yesterday
Summary:
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures without explicitly encoding any structural bias. In this work, we investigate sources of inductive bias in transformer models and their training that could cause such generalization behavior to emerge. We extensively experiment with transformer models trained on multiple synthetic datasets and with different training objectives and show th...


Keywords: transformer, coding

arxiv.org | Yesterday
Summary:
This paper presents a question-answering approach to extract document-level event-argument structures. We automatically ask and answer questions for each argument type an event may have. Questions are generated using manually defined templates and generative transformers. Template-based questions are generated using predefined role-specific wh-words and event triggers from the context document. Transformer-based questions are generated using large language models trained to formulate questions b...
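
A minimal sketch of the template-based half of that question-generation step (a role-specific wh-word plus the event trigger filled into a template); the roles, hints, and template wording are invented for illustration.

```python
# Hypothetical role -> wh-word mapping and a single question template, illustrating
# template-based question generation for event arguments from an event trigger.
WH_WORDS = {"Agent": "Who", "Place": "Where", "Time": "When", "Instrument": "What"}
ROLE_HINTS = {
    "Agent": "carried out the action",
    "Place": "did the event take place",
    "Time": "did the event happen",
    "Instrument": "was used",
}
TEMPLATE = "{wh} {hint} in the {trigger} event?"

def questions_for_event(trigger: str) -> dict:
    """Generate one question per argument role for a given event trigger."""
    return {
        role: TEMPLATE.format(wh=WH_WORDS[role], hint=ROLE_HINTS[role], trigger=trigger)
        for role in WH_WORDS
    }

for role, question in questions_for_event("attack").items():
    print(f"{role}: {question}")
```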


Keywords: transformer, generative

