python – A simple attention based text prediction model from scratch using pytorch

I have created a simple self-attention based text prediction model using PyTorch. The attention formula used for creating the attention layer is:

$$\operatorname{Attention}(Q, K, V) = \operatorname{softmax}\left(\frac{QK^T}{\sqrt{d}}\right)V$$

I want to validate whether the whole code is implemented correctly, particularly my custom implementation of Attention layer.

The whole code

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import random

# Sample text for Training
test_sentence = """Thomas Edison. The famed American inventor rose to prominence in the late
19th century because of his successes, yes, but even he felt that these successes
were the result of his many failures. He did not succeed in his work on one of his
most famous inventions, the lightbulb, on his first try nor even on his hundred and
first try. In fact, it took him more than 1,000 attempts to make the first incandescent
bulb but, along the way, he learned quite a deal. As he himself said,
"I did not fail a thousand times but instead succeeded in finding a thousand ways it would not work." 
Thus Edison demonstrated both in thought and action how instructive mistakes can be.
""".lower().split()

# Build a list of tuples.  Each tuple is ((word_i-2, word_i-1), target word)
trigrams = [((test_sentence[i], test_sentence[i + 1]), test_sentence[i + 2])
            for i in range(len(test_sentence) - 2)]

# print the first 3, just so you can see what they look like
print(trigrams[:3])

vocab = list(set(test_sentence))
word_to_ix2 = {word: i for i, word in enumerate(vocab)}

# Number of Epochs (illustrative value)
EPOCHS = 25

# SEQ_SIZE is the number of words we are using as a context for the next word we want to predict
SEQ_SIZE = 2  # each trigram supplies two context words

# Embedding dimension is the size of the embedding vector (illustrative value)
EMBEDDING_DIM = 10

# Size of the hidden layer (illustrative value)
HIDDEN_DIM = 256

class Attention(nn.Module):
    """
    A custom self attention layer
    """
    def __init__(self, in_feat, out_feat):
        super().__init__()
        self.Q = nn.Linear(in_feat, out_feat)  # Query
        self.K = nn.Linear(in_feat, out_feat)  # Key
        self.V = nn.Linear(in_feat, out_feat)  # Value
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        Q = self.Q(x)
        K = self.K(x)
        V = self.V(x)
        d = K.shape[-1]  # dimension of key vector (last axis, not the sequence length)
        QK_d = (Q @ K.T) / d ** 0.5
        prob = self.softmax(QK_d)
        attention = prob @ V
        return attention

class Model(nn.Module):
    def __init__(self, vocab_size, embed_size, seq_size, hidden):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.attention = Attention(embed_size, hidden)
        self.fc1 = nn.Linear(hidden * seq_size, vocab_size)  # converting n rows to 1

    def forward(self, x):
        x = self.embed(x)
        x = self.attention(x).view(1, -1)
        x = self.fc1(x)
        log_probs = F.log_softmax(x, dim=1)
        return log_probs

learning_rate = 0.001
loss_function = nn.NLLLoss()  # negative log likelihood

model = Model(len(vocab), EMBEDDING_DIM, SEQ_SIZE, HIDDEN_DIM)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Training
for i in range(EPOCHS):
    total_loss = 0
    for context, target in trigrams:
        # e.g. context, target = ('thomas', 'edison.'), 'the'
        # step 1: context id generation
        context_idxs = torch.tensor([word_to_ix2[w] for w in context], dtype=torch.long)

        # step 2: setting zero gradient for models
        model.zero_grad()

        # step 3: Forward propagation for calculating log probs
        log_probs = model(context_idxs)

        # step 4: calculating loss
        loss = loss_function(log_probs, torch.tensor([word_to_ix2[target]], dtype=torch.long))

        # step 5: finding the gradients
        loss.backward()

        # step 6: updating the weights
        optimizer.step()

        total_loss += loss.item()
    if i % 2 == 0:
        print("Epoch: ", str(i), " Loss: ", str(total_loss))

# Prediction
with torch.no_grad():
    # Fetching a random context and target
    rand_val = trigrams[random.randrange(len(trigrams))]
    context = rand_val[0]
    target = rand_val[1]
    # Getting context and target indices
    context_idxs = torch.tensor([word_to_ix2[w] for w in context], dtype=torch.long)
    target_idxs = torch.tensor([word_to_ix2[w] for w in [target]], dtype=torch.long)
    print("Actual indices: ", context_idxs, target_idxs)
    log_preds = model(context_idxs)
    print("Predicted indices: ", torch.argmax(log_preds))
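
One way to sanity-check a custom attention layer is to compare it against a small reference implementation of scaled dot-product attention. The sketch below is only an illustration in plain Python (no PyTorch), and the names `softmax` and `scaled_dot_product_attention` are hypothetical helpers, not part of the code above. It assumes `Q`, `K` and `V` are lists of row vectors and that the scaling factor is the square root of the key vector length.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d)) V.

    Q, K, V are lists of rows; d is the length of one key vector.
    """
    d = len(K[0])
    # scores[i][j] = <Q[i], K[j]> / sqrt(d)
    scores = [
        [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        for q in Q
    ]
    # Each row of weights sums to 1 (softmax over the keys).
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted average of the rows of V.
    output = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# With identical keys every query attends uniformly,
# so each output row is the plain average of the rows of V.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights)
print(output)
```

Feeding the same small matrices through the PyTorch layer (with the linear projections bypassed) should give matching rows if the layer follows the same formula.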

spellcasting – I need help with distance scaling in a homebrew RPG – from “your body” to “the whole planet” in 10 simple steps

Years ago, I developed and playtested a simple RPG system in which magic-like powers were driven by two parameters, namely Power and Precision.

  • With high precision and minimal power you could burn a specific card
    in a standard 52 cards deck without charring any other card.

  • With high power and minimal precision you could easily burn a house,
    maybe even a village, but with a risk of burning the wrong one.

    There were also specific techniques to be used, but that’s beside the point for now.

Scales were non-linear: each step roughly doubled or tripled what you could do (I intend to post it as a separate review question). The idea behind that was to let characters who allowed themselves to have flaws shine all the more in their strengths, and to make “balanced” characters objectively worse.

Playtesting showed that it worked reasonably well, but what we lacked in the magic department was Distance: how far a PC would be able to affect things with their spells and minds. What we decided would be OK was a non-linear progression from only being able to affect one’s own body and the things one touches, up to the whole globe. A high Precision, high Distance, minimal Power kit would make a great messenger-type character or seer. Self distance would be good for shapeshifters, touch for healers, etc.

We got this rough sketch, which we never playtested properly; it was in storytelling terms and had a big hole in it. I was wondering if and how to turn it into numerical values, like meters or kilometers, preferably with a simple equation and not just an arbitrary table.

  1. One’s own body
  2. Touch
  3. Immediate area
  4. Village
  5. Town or City
  6. A continent
  7. Whole world

Numerical or not, this progression seems totally uneven. I expect and want each step to be bigger than the previous one, but I want them to feel natural, to feel “not arbitrary”, for lack of a better word. (I am not a native English speaker.)
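
To illustrate what "a simple equation" could look like here, below is a minimal sketch of a geometric progression, where every rank multiplies the reach by the same factor. All numbers are assumptions made for the example, not part of the system described above: ten ranks in total, rank 2 ("touch") fixed at about 1 m, and rank 10 ("whole world") fixed at about 20,000 km, roughly half of Earth's circumference. Rank 1 ("own body") is left as a non-numeric special case. The function name `reach_in_metres` is hypothetical.

```python
import math


def reach_in_metres(rank, touch_rank=2, world_rank=10,
                    touch_m=1.0, world_m=2.0e7):
    """Reach for a Distance rank under an assumed geometric progression.

    The anchors (touch = 1 m at rank 2, whole world = 20,000 km at rank 10)
    are illustrative assumptions; every rank in between multiplies the
    reach by the same constant ratio.
    """
    if rank < touch_rank:
        raise ValueError("ranks below 'touch' have no numeric reach")
    ratio = (world_m / touch_m) ** (1.0 / (world_rank - touch_rank))
    return touch_m * ratio ** (rank - touch_rank)


# Print the whole ladder so the steps can be compared with the story terms.
for rank in range(2, 11):
    print(rank, round(reach_in_metres(rank), 1), "m")
```

With these anchors each step is roughly eightfold. Moving either anchor, or changing the number of ranks, changes the ratio but keeps every step equally large in relative terms, which is one possible reading of "not arbitrary".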

Problem: how to make this progression feel more natural?

By “feel natural” I mean that your average player wouldn’t be shocked or taken aback by it. An ideal answer would point to other systems that have already solved a similar problem, your own playtested homebrew experiences, or design guidelines from a reputable source I could use. Seeing your specific solution, be it a table or an equation, is less important than the process that led to it, or one that confirmed it works.

Note: 0 means none whatsoever in this system. 0 power is obvious. 0 control means you can’t even decide if it’s on or off (basically a death sentence unless the other one is 0). 0 on distance means your power cannot reach beyond your imagination, your own mind. Many children’s games teach basic mental discipline, so Control 1 is free.

Decomposition of Lie algebra: do the simple and maximal torus parts commute?

I have the following exercise:

Consider a Lie algebra $\mathfrak{g}$. Decompose $\mathfrak{g}$ using the Levi decomposition, so $\mathfrak{g}=\mathfrak{s}\oplus \mathfrak{r}$. Let $\mathfrak{a}$ be a maximal torus inside the radical. Is it always true that $[\mathfrak{s},\mathfrak{a}]=0$? Discuss which hypotheses are needed to obtain such a result.

I started by assuming that $\mathfrak{g}$ is algebraic (to avoid exceptional or strange cases, if any), and I proceeded as suggested, so I get
$$\mathfrak{g}=\mathfrak{s}\oplus \mathfrak{a}\oplus\mathfrak{m}$$
for some complement $\mathfrak{m}$ of $\mathfrak{a}$ in the radical $\mathfrak{r}$. For sure $[\mathfrak{s},\mathfrak{a}]\subset \mathfrak{r}$. Beyond that I have no idea how to continue, nor can I prove that the statement is always true (I guess it is false).

For the second question, I can think only of “trivial” answers: it is true for solvable Lie algebras and for semi-simple Lie algebras.


Subgroup rank of finite simple groups

Definition: The subgroup rank of a finite group G is the minimal natural number n such that every subgroup of G can be generated by n elements (or fewer).

This invariant has been studied extensively for various families of groups. I am interested in the family of finite simple groups and I have been unable to find any relevant information in the literature.

Question 1: Are there only finitely-many finite simple, non-abelian groups G of a given subgroup rank n?

Some relatively straight-forward comments and reductions:

It is not too difficult to show that there are only finitely-many alternating groups of subgroup rank at most n (by explicitly constructing elementary-abelian subgroups of a certain subgroup rank). There are also only finitely-many sporadic groups, according to the classification. These observations reduce the above question to finite simple groups of Lie-type.

Question 2: Are there only finitely-many finite simple groups G of Lie-type with given subgroup rank n?

It is again not too difficult to show that the “field rank” of G is bounded from above by a function of n (by looking at the natural homomorphism from the field to the root subgroups). It is also possible to show that the Lie-rank of G is bounded from above by a function of n. These observations further reduce question 2 to bounding the defining characteristic of the simple group of Lie-type by some function that depends only on the subgroup rank n. Unfortunately, I do not have any good intuition to determine whether the latter statement is true or not.

I hope both questions have a positive answer because that would give us a nice property about the FSG. But I suspect we can prove the answers to be “no” by simply making some judicious choice for the Lie-type, field-rank, and Lie-rank and by then looking at the structure of the Sylow-subgroups of G, as the characteristic goes through the different primes.

development – How to develop a Simple data entry form with 3 to 4 questions in SharePoint 2013


javascript – Does this class creator for a simple text game seem convoluted?

I would prefer feedback along the lines of simplification, and perhaps an introduction to jQuery and arrow functions. I want to get better, and this is the best way I know how.

I also want it to be a bit more efficient.

function GeneratePersonWithList(amountOfItems) {
  var items = []
  var shL = []
  var m = getRandomInt(1000)
  function getRandomInt(max) {
    return Math.floor(Math.random() * max);
  }
  class People {
    constructor(money, sL) {
      this.money = m
      this.sL = sL = shL + " "
    }
    shoppingListUpdate() {

    }
  }
  class item {
    constructor(id, name, price) {
      this.id = id
      this.name = name
      this.price = price
      items.push([id, name, price])
    }
  }
  function itemCreation() {
    new item(0, "Candy Bar", 1)
    new item(1, "Gum", 0.5)
    new item(2, "Apple", 5)
    new item(3, "Banana", 6)
  }
  function shoppingList(amountOfItems) {
    for (i = 0; i < getRandomInt(amountOfItems) + 1; i++) {
      shL.push(" " + items[getRandomInt(items.length)][1])
    }
  }
  // Fill the item table and the shopping list before the person is created
  itemCreation()
  shoppingList(amountOfItems)
  let PeopleNames = "./PeopleNames.json"
  let request = new XMLHttpRequest();
  request.open('GET', PeopleNames)
  request.responseType = 'json';
  request.onload = function () {
    const Names = request.response;
    function Genders(Gender) {
      if (Gender === 0)
        return Names.firstNamesF[getRandomInt(999)] + " " +

      else if (Gender === 1) {
        return Names.firstNamesM[getRandomInt(999)] + " " +
      }
    }
    Generator = Genders(getRandomInt(2));
    function getRandomInt(max) {
      return Math.floor(Math.random() * max);
    }
    var div = document.getElementById("name")
    div.innerHTML = Generator
  }
  request.send();
  var Person = new People()
  var div = document.getElementById("1")
  div.innerHTML = `$${Person.money} \n`
  var div = document.getElementById("2")
  div.innerHTML = `Shopping list:${Person.sL}`
}
var but = document.getElementById("List")
but.addEventListener('click', function () { GeneratePersonWithList(10) })

ct.category theory – Recovering an abelian category from the Ext of its simple objects

Let $C$ be an abelian category, assume for simplicity that $C$ is enriched over $Vect_k$ (vector spaces over $k$) for some fixed field $k$.

Suppose also that $C$ is both Artinian and Noetherian, so that for any object $X$ there is a sequence of objects $0=X_0 \hookrightarrow \ldots \hookrightarrow X_n = X$ with $X_i/X_{i-1}$ simple. Finally suppose that $C$ has enough injective/projective objects so that $\operatorname{Ext}_C$ can be defined.

Given $C$, we build a new category $S$, enriched over graded $k$-vector spaces, in the following way:

  • The objects of $S$ are the simple objects of $C$
  • If $X,Y\in Ob(S)$ then $\operatorname{Hom}_S(X,Y) = \bigoplus_{n\geq 0}\operatorname{Ext}^n_C(X,Y)$
  • Compositions of morphisms are defined using the natural maps $\operatorname{Ext}^n_C(X,Y)\otimes \operatorname{Ext}^m_C(Y,Z)\to\operatorname{Ext}^{n+m}_C(X,Z)$

My question is: Can we recover $C$ from $S$ (say up to equivalence)?

Assuming the answer is “yes”, I guess that there is an analogue for when $C$ is only enriched over $Ab$, maybe if we redefine $S$ so that $\operatorname{Hom}_S(X,Y)=\operatorname{Hom}_{D(C)}(X,Y)$ or something.

algorithms – Find all vertices that are included in a simple cycle through a fixed vertex in a directed graph

Given a directed graph $G = (V, E)$ and a vertex $v \in V$, how can one find all vertices $v'$ such that there exists a simple cycle $v \to \dots \to v' \to \dots \to v$? That is, how can one find the set of vertices $$V' = \{v' : \exists c,\ c \text{ is a simple cycle in } G,\ v \in c,\ v' \in c \}$$

I found a related question: Find all cycles through a given vertex. I can first find all cycles and then take the union of their vertices to get $V'$. However, is there a more efficient way to do so?
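
For reference, the baseline mentioned above (enumerate the simple cycles through $v$ and collect their vertices) can be sketched as a depth-first search that never revisits a vertex on the current path. This is only an illustration of that brute-force approach, not an efficient answer: it can take exponential time in the worst case. The adjacency-dict representation and the function name `vertices_on_simple_cycles` are assumptions made for the example.

```python
from typing import Dict, Hashable, List, Set


def vertices_on_simple_cycles(graph: Dict[Hashable, List[Hashable]],
                              v: Hashable) -> Set[Hashable]:
    """Collect every vertex lying on some simple cycle through v.

    graph maps each vertex to the list of its out-neighbours.
    Brute force: extend simple paths from v and record the path
    whenever an edge leads back to v.
    """
    on_cycle: Set[Hashable] = set()
    path = [v]          # current simple path starting at v
    visited = {v}       # vertices on the current path

    def dfs(u):
        for w in graph.get(u, []):
            if w == v:
                # The path v -> ... -> u plus the edge u -> v closes a simple cycle.
                on_cycle.update(path)
            elif w not in visited:
                visited.add(w)
                path.append(w)
                dfs(w)
                path.pop()
                visited.remove(w)

    dfs(v)
    return on_cycle


# Vertex 3 is in the same strongly connected component as 1,
# but the only simple cycle through 1 is 1 -> 2 -> 1.
example = {1: [2], 2: [1, 3], 3: [2]}
print(vertices_on_simple_cycles(example, 1))
```

The small example also shows why taking the strongly connected component of $v$ is not enough: a vertex can be mutually reachable with $v$ without lying on any simple cycle through it.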

dnd 5e – What’s a simple way to handle the ignition of flammable objects by spells?

A lot of my party’s (3rd level) casters have something of a predilection for the Firebolt cantrip. Like some other fire-damage spells, Firebolt’s description specifies that:

A flammable object hit by this spell ignites if it isn’t being worn or carried.

I’d love to give my players some of the flavourful utility that this description encourages: ‘You miss the goblin, but you hit the crate he’s hiding behind and now it’s on fire.’

I’m aware of the general rules for hitting and destroying objects, which are covered under Statistics for Objects on p.246 of the DMG. I’m specifically interested in how these rules can be best applied (or overhauled) for the purpose of setting flammable objects on fire in a way that’s fun, intuitive and easy to manage. Specific questions I have include:

  • Should flammable objects be vulnerable to fire damage?
  • Should fire-damage spell attack rolls have advantage against flammable objects?
  • How should the ongoing damage of something being ‘on fire’ be adjudicated?
  • Is the descriptor ‘flammable’ applied to objects at the DM’s discretion?

cache – How to solve simple wp simple ajax chat caching problem

This might be a solution but I have not tested it.

Go to: wp-admin/admin.php?page=litespeed-cache#excludes

Or navigate to LiteSpeed > Cache > Excludes.

<IfModule LiteSpeed>
 RewriteEngine On
 RewriteCond %{REQUEST_URI} ^/filename.php
 RewriteRule .* - [E=cache-control:no-cache]
</IfModule>

Replace filename with the plugin filename.