Neural networks: transfer of learning with different input sizes


machine learning: are chatbots a good example of overfitting?

In my high school class we are learning about artificial intelligence, and especially the problems that come with machine learning. I was wondering whether chatbots like Cleverbot are good examples of overfitting. When you receive an answer that makes no sense, is it the result of the bot modeling its training data too closely, or is it something completely different?
Thanks for the help

Neural network: what techniques made training in deep learning possible?

I have read in many texts that in the early days of neural network computing, backpropagation was not successful for deep networks, and that computing power was too limited to run large simulations. Today, with deep networks of tens or hundreds of layers, we still use backpropagation for training. What was the problem with backpropagation in the old days, and what has changed to make training feasible for modern deep networks? Only computing power?

Learning – Counting correct questions

Can someone explain to me the function of the line "questao += 1"? I am a beginner and I am trying to understand the logic behind the code, not just make it work.

> pontos = 0
> questao = 1
> while questao <= 3:
>     resposta = input('Resposta da questão {}: '.format(questao))
>     if questao == 1 and resposta == 'b':
>         pontos = pontos + 1
>     if questao == 2 and resposta == 'a':
>         pontos = pontos + 1
>     if questao == 3 and resposta == 'd':
>         pontos = pontos + 1
>     questao += 1
> print('O aluno fez {} pontos'.format(pontos))
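To see what `questao += 1` does, here is a minimal, self-contained sketch of the same counter pattern. The student answers are hardcoded (a hypothetical list I made up, so the code runs without `input()`):

```python
respostas = ['b', 'c', 'd']          # hypothetical student answers for questions 1-3
gabarito = {1: 'b', 2: 'a', 3: 'd'}  # answer key from the original code

pontos = 0
questao = 1
while questao <= 3:
    resposta = respostas[questao - 1]
    if gabarito[questao] == resposta:
        pontos += 1
    questao += 1  # advance to the next question; without this line,
                  # questao stays 1 forever and the while loop never ends

print('O aluno fez {} pontos'.format(pontos))  # prints "O aluno fez 2 pontos"
```

So `questao += 1` is shorthand for `questao = questao + 1`: it is the loop counter that moves the quiz from question 1 to 2 to 3, and it is also what eventually makes `questao <= 3` false so the loop terminates.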

machine learning: ClassifierFunction output dependent on current $ ContextPath

I am trying to use Classify to predict heads in a Mathematica expression from the heads of their arguments, which may be functions whose heads are not in the current $ContextPath. With the default method "LogisticRegression", the behavior is as expected: the output does not depend on $ContextPath.

a = Test`testA;
b = Test`testB;

clf = Classify[{{a} -> 1, {b} -> 2}];

clf[{a}, "Probabilities"]
Block[{$ContextPath = {"Test`"}}, clf[{a}, "Probabilities"]]
clf[{"foo"}, "Probabilities"]

<|1 -> 0.999679, 2 -> 0.000321061|>
<|1 -> 0.999679, 2 -> 0.000321061|>
<|1 -> 0.999997, 2 -> 3.21752*10^-6|>

However, when I specify Method -> "Markov", the symbol is treated as equivalent to an unknown symbol whenever its context is on $ContextPath.

clf2 = Classify[{{a} -> 1, {b} -> 2}, Method -> "Markov"];

clf2[{a}, "Probabilities"]
Block[{$ContextPath = {"Test`"}}, clf2[{a}, "Probabilities"]]
clf2[{"foo"}, "Probabilities"]

<|1 -> 0.916597, 2 -> 0.0834028|>
<|1 -> 0.5, 2 -> 0.5|>
<|1 -> 0.5, 2 -> 0.5|>

Since SequencePredict always uses Method -> "Markov", it always exhibits the second behavior. This seems to be a problem, since $ContextPath changes when packages are loaded.

  • Is this a bug? (I have already contacted Wolfram Research.)
  • What is the root cause? Hash does not depend on $ContextPath, and overloading ToString has no effect.
  • Is there a clean workaround other than hashing the input manually (which does not work well for SequencePredict)? Wrapping each use of the ClassifierFunction in Block[{$ContextPath = {"System`"}}, ...] does not work.

machine learning – Clustering – Example of ties in complete linkage

I am studying unsupervised learning methods (clustering) and I have come across the complete-linkage method. I have also seen the following statement:

Unlike single linkage, the complete-linkage method can be strongly affected by tie cases (where two pairs of groups/clusters have the same distance value in the distance matrix).

I would like an example where this occurs and, if possible, an explanation of why this occurs.
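One small construction that illustrates the claim (my own example, not from any source cited in the question): four equally spaced points on a line, where the first merge is tied. A numpy sketch of the linkage distances involved:

```python
import numpy as np

# Four points on a line; every adjacent pair is at distance 1, so the
# first merge is tied between {0,1}, {1,2}, and {2,3}.
x = np.array([0.0, 1.0, 2.0, 3.0])
dist = np.abs(x[:, None] - x[None, :])  # full pairwise distance matrix

def complete(dist, cluster_a, cluster_b):
    """Complete-linkage distance: the MAXIMUM pairwise distance."""
    return dist[np.ix_(cluster_a, cluster_b)].max()

def single(dist, cluster_a, cluster_b):
    """Single-linkage distance: the MINIMUM pairwise distance."""
    return dist[np.ix_(cluster_a, cluster_b)].min()

# Tie-break choice 1: merge {0,1} first, then the other tied pair {2,3}.
# Second merge happens at height 1.0; two-cluster solution is {0,1} | {2,3}.
print(complete(dist, [2], [3]))       # 1.0

# Tie-break choice 2: merge the tied pair {1,2} first instead. Now point 0
# is at complete-linkage distance 2 from {1,2}, so the second merge happens
# at height 2.0; two-cluster solution is {0,1,2} | {3}.
print(complete(dist, [0], [1, 2]))    # 2.0

# Under single linkage, every merge happens at distance 1.0 no matter how
# the tie is broken, so the tie is harmless there:
print(single(dist, [0], [1, 2]))      # 1.0
print(single(dist, [2], [3]))         # 1.0
```

The reason is that complete linkage measures the farthest pair, so which points end up inside a cluster changes later merge distances; single linkage measures the nearest pair, and with equally spaced points the nearest gap stays 1 regardless of the tie-break.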

machine learning: how can I get a NetTrainResultsObject after NetTraining with a NetGraph?

Apparently, the result of NetTrain can be a NetTrainResultsObject that you should be able to query with properties such as "ValidationLoss". After training, my object seems to be stuck in NetTrainResultsObject["FinalNet"] mode, which is the default. I didn't see anything in the manual that prohibits what I am trying to do. Any advice? Thank you!

The following does not work:

net = NetGraph[{layers}, {NetPort["Input1"], NetPort["Output1"], connections}, "Input1" -> {1, 83}]

trained = NetTrain[NetInitialize[net], ..]

After a successful training:

trained["ValidationLoss"]

NetGraph::invindata2: The data supplied to the "Input1" port was not a 1*83 array of real numbers (or a list of these).
$Failed


Machine learning: run an LSTM on several columns separately

I have an LSTM neural network that I use to test predictability, and it works for a single column. But now I want to run it on several columns for different items and calculate the 'ABSE' for each column. For example, if I have two columns:

[image: a table with two columns of example data]

I need to calculate the 'ABSE' function for each column separately.

My code below fails. Can someone help me?

This is what I tried, but I get a value error:

ValueError: non-broadcastable output operand with shape (1,1) doesn't 
match the broadcast shape (1,2)

This happens on the line:

 ---> 51     trainPredict = scaler.inverse_transform(trainPredict)

the code:

def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

def ABSE(a, b):
    ABSE = abs((b-a)/b)
    return numpy.mean(ABSE)

columns = df[['Item1', 'Item2']]

for i in columns:
    # normalize the dataset
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)
    # split into train and test sets
    train_size = int(len(dataset) * 0.5)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
    look_back = 1
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)
    trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
    testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
    # create and fit the LSTM network
    model = Sequential()
    model.add(LSTM(1, input_shape=(1, look_back)))
    model.compile(loss='mean_squared_error', optimizer='adam'), trainY, epochs=1, batch_size=1, verbose=0)
    # make predictions
    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)
    # invert predictions
    trainPredict = scaler.inverse_transform(trainPredict)
    trainY = scaler.inverse_transform([trainY])
    testPredict = scaler.inverse_transform(testPredict)
    testY = scaler.inverse_transform([testY])
    # calculate ABSE for this column
    trainScore = ABSE(trainY[0], trainPredict[:, 0])
    print('Train Score: %.2f ABSE' % (trainScore))
    testScore = ABSE(testY[0], testPredict[:, 0])
    print('Test Score: %.2f ABSE' % (testScore))
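The shape mismatch in the ValueError comes from fitting the scaler on the full 2-column `dataset` but then calling `inverse_transform` on 1-column predictions: a scaler fitted on shape (n, 2) expects 2 columns back. A pure-numpy sketch of the round-trip that works (a hand-rolled stand-in for `MinMaxScaler`, on made-up data, since the point is only the shapes) scales each column independently:

```python
import numpy as np

# Hypothetical Item1 / Item2 data, standing in for the two columns.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0],
                 [4.0, 40.0]])

def fit_minmax(col):
    """Like MinMaxScaler.fit on a single column: remember min and max."""
    return col.min(), col.max()

def transform(col, lo, hi):
    return (col - lo) / (hi - lo)

def inverse_transform(scaled, lo, hi):
    return scaled * (hi - lo) + lo

# Scale, "predict", and invert one column at a time, so every
# inverse_transform sees the same single-column shape it was fitted on.
for j in range(data.shape[1]):
    col = data[:, j:j+1]               # keep 2-D shape (n, 1), not (n,)
    lo, hi = fit_minmax(col)
    scaled = transform(col, lo, hi)
    restored = inverse_transform(scaled, lo, hi)
    assert np.allclose(restored, col)  # round-trip recovers the column
```

With sklearn, the analogous fix would be to create a fresh `MinMaxScaler` inside the loop and fit it on the single column being processed, so `inverse_transform` of the (n, 1) predictions matches what the scaler was fitted on.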

Machine learning – Recovering a Boolean vector from dot products


I want to determine a Boolean vector $b \in \{0,1\}^n$ consisting of zeros and ones, but I cannot access it directly. I can only call a black-box computer code that takes the dot product of $b$ with a real-valued vector $v \in \mathbb{R}^n$ of my choice; that is, access to $b$ is available only through evaluation of the map
$$v \mapsto b^T v.$$
How can I recover all entries of $b$ using as few of these dot products as possible? (Perhaps with just one dot product?)

Below, I detail a couple of ideas I had that could work in theory, but that do not work in practice (I think). To be concrete, one can assume that $n \approx 1 \text{ million}$, and that arithmetic is done in double-precision floating point. This question arose as a subproblem in a machine learning application.

Idea 1:

One idea I had was to use a vector with fast-growing entries. Say, for example, $n = 9$. Then we could use the vector
$$v = \begin{bmatrix} 1 & 10 & 100 & 1000 & \dots \end{bmatrix}^T.$$
Then one could read off $b$ as the digits of $b^T v$. The problem with this approach is that the entries grow so fast that finite-precision arithmetic will not work for large $n$.
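A base-2 variant of this fast-growing-entries idea (a quick numpy sketch of my own) recovers $b$ exactly from a single dot product for small $n$, and also shows exactly where double precision gives out: the probe entries span $2^n$, and a double has a 53-bit significand.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                              # works up to about n = 53 in double precision
b = rng.integers(0, 2, size=n)      # the hidden Boolean vector

# Base-2 probe vector: v_i = 2^i, so b^T v encodes b in its binary digits.
v = 2.0 ** np.arange(n)
s = float(b @ v)                    # the single dot product we get to observe

# Read b back off as the binary digits of s.
recovered = (np.int64(s) >> np.arange(n)) & 1
assert np.array_equal(recovered, b)

# For n much beyond 53, 2^(n-1) exceeds the 53-bit significand of a double,
# the low-order bits of the sum are rounded away, and recovery fails, which
# is the finite-precision obstruction described above.
```

So one dot product suffices in exact arithmetic, but in doubles this scheme caps out at roughly $n \le 53$ bits per query, far short of $n \approx 10^6$.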

Idea 2:

Another idea I had was to use a vector with entries that are algebraically independent. Then determining $b$ from $b^T v$ is a subset-sum problem.

For example, if $n = 3$ and
$$v = \begin{bmatrix} \pi & e & 1 \end{bmatrix}^T,$$
then $b^T v$ will take one of a finite number of values,
$$b^T v \in \{\pi, ~e, ~1, ~\pi + e, ~\pi + 1, ~e + 1, ~\pi + e + 1\}.$$
We can determine which of these is the case, thereby determining $b$.

But this seems quite combinatorial, and therefore not feasible for large $ n $.