Machine learning: model suggestion for malware detection based on multiple API call sequences

I am trying to build an RNN (LSTM) model to classify binaries as benign / malware. The data structure I currently have looks as follows:

"binary1": {
"tag": 1,
"sequences": [
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],
"binary2": {
"tag": 0,
"sequences": [
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],
            ["api1","api2","api3", ...],

Here, each binary has a variable number of sequences, and each sequence has a variable number of API calls.
I can pad the data so that all binaries have the same number of sequences and each sequence has the same number of API calls.
But my question is, how can I use this data for training?
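The padding step mentioned above can be sketched like this (the pad token and the target sizes are arbitrary choices of mine, not from my real pipeline):

```python
# Pad each binary to a fixed number of sequences, and each sequence to a
# fixed number of API calls, using a hypothetical "<pad>" token.
def pad(seqs, n_seqs, seq_len, pad_token="<pad>"):
    # Truncate/pad every sequence to seq_len calls.
    seqs = [s[:seq_len] + [pad_token] * (seq_len - len(s)) for s in seqs]
    # Pad with all-<pad> sequences up to n_seqs.
    seqs += [[pad_token] * seq_len] * (n_seqs - len(seqs))
    return seqs

print(pad([["api1", "api2"]], n_seqs=2, seq_len=3))
# [['api1', 'api2', '<pad>'], ['<pad>', '<pad>', '<pad>']]
```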

The problem is that not all sequences of a malicious binary are necessarily malicious. So if I use the label to tell the model that all those sequences are malicious, and some similar sequences also appear in benign files, a benign binary could end up being treated as malware.

To better understand the problem, think of each binary as a person on Twitter, each API call as a word, and each API call sequence as a tweet. A user can post many tweets, but only some of them may be about sports (for example). In my training data, I know which people tweet about sports, but I do not know which of their tweets are about sports. So what I am trying to do is classify whether each person likes sports or not, based on all of that person's tweets.

In the same way, I know whether a binary is malicious or not, but I do not know which API call sequences are responsible for the malice. And I want the model to identify those sequences from the training data. Is that possible? And what architecture should I use?
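This labels-at-the-binary-level, unlabeled-sequences setup is often framed as multiple instance learning. A minimal NumPy sketch of the pooling idea (all scores invented, and max pooling is just one choice; an attention-weighted sum is a common alternative): each sequence gets its own score from some sequence encoder such as an LSTM, and only the pooled binary-level score is compared against the label, so benign-looking sequences inside a malicious binary are never forced to be labeled malicious themselves.

```python
import numpy as np

def binary_score(sequence_scores):
    """Combine per-sequence maliciousness scores into one
    binary-level score via max pooling."""
    return float(np.max(sequence_scores))

# Hypothetical per-sequence scores from some sequence encoder:
malicious_binary = [0.1, 0.05, 0.92, 0.2]  # one sequence looks malicious
benign_binary = [0.1, 0.07, 0.12, 0.05]    # none do

print(binary_score(malicious_binary))  # 0.92 -> flagged
print(binary_score(benign_binary))     # 0.12 -> not flagged
```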

I hope I have conveyed my question well. Thanks for reading, and I look forward to your suggestions.

Machine Learning: How can you train a neural network in Mathematica to recognize and count all the occurrences of a certain personalized category in an image?

I am trying to use Mathematica for the following task:

(1) Obtain an aerial view of a specific area in Mongolia

(2) Count the number of parcels containing houses and the number of parcels containing only gers (traditional yurts), not houses, in that area.

  • The houses typically appear as rectangular colored areas (their roofs), while the gers appear as white discs with a dark dot in the middle.
  • The plots in Mongolia are usually surrounded by walls or wooden fences to keep the 65 million free-roaming animals (goats, sheep, cows, horses) that belong to the nomadic population out of the premises.

An example for an interesting area to examine would be:

GeoImage[{GeoPosition[{47.95, 106.78}],
   GeoPosition[{47.97, 106.82}]},
 GeoRangePadding -> None,
 GeoProjection -> "Mercator",
 ImageSize -> 2500]

As you can see, GeoImage works very well for (1).
However, nice and convenient built-in functions like ImageCases and ImageBoundingBox are of no help for (2), as they seem to recognize only the object categories that Wolfram coded into the underlying neural networks. Unfortunately, those do not include bird's-eye views of gers and houses.

I would have no difficulty solving this problem elegantly with TensorFlow in Python, just by following one of the many tutorials available on the web. However, I cannot figure out how to do this in Mathematica. Is it possible at all?
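For comparison outside Mathematica, even a classical-image-processing baseline is easy to sketch. A minimal NumPy/SciPy example on a synthetic image (all sizes, positions, and thresholds are made up) that counts ger-like bright discs by connected-component labeling:

```python
import numpy as np
from scipy import ndimage

# Tiny synthetic "aerial view": two bright discs, each with a dark center dot.
img = np.zeros((100, 100))
yy, xx = np.ogrid[:100, :100]
for cy, cx in [(30, 30), (70, 60)]:
    disc = (yy - cy) ** 2 + (xx - cx) ** 2 <= 8 ** 2
    img[disc] = 1.0   # white disc (the ger roof)
    dot = (yy - cy) ** 2 + (xx - cx) ** 2 <= 2 ** 2
    img[dot] = 0.2    # dark dot in the middle

# Bright pixels form one connected ring per ger; label and count them.
labels, n_gers = ndimage.label(img > 0.5)
print(n_gers)  # -> 2
```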

Any help / suggestion would be much appreciated.

Machine learning: what is the official name of a specific type of combination algorithm?

Let's say I have the following set of variables:


The values represent a list of variables in a data set. Each variable has some level of correlation with the target variable $. The correlation increases or decreases depending on which combination of variables you use against the target $. For example, BCJX could have a higher correlation with $ than OQTVW does. I will test every possible combination with the training algorithm and record them all, with their accuracy scores, in a concise CSV file. But I do not know the name of an algorithm that could combine these variables in all possible ways.

In other words, I want to find the combination with the highest correlation value and the smallest dimension.
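For concreteness, the enumeration I have in mind is over the power set of the variable list; a tiny sketch using the two variable names from above:

```python
# Enumerate every non-empty combination of variables (the power set minus
# the empty set); each combination would then be scored against the target $.
from itertools import combinations

variables = ["BCJX", "OQTVW"]  # the two example variables from the question
subsets = [c for r in range(1, len(variables) + 1)
           for c in combinations(variables, r)]
print(subsets)  # [('BCJX',), ('OQTVW',), ('BCJX', 'OQTVW')]
```

Note that this grows as 2^n in the number of variables, which is why exhaustively testing every combination quickly becomes infeasible.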

computability: how can the VC dimension of Turing machines be finite?

The VC dimension of a hypothesis class $\mathcal{H}$ is defined as the size of the largest set $C$ that $\mathcal{H}$ can shatter. This paper shows that the VC dimension of the set of all Turing machines with $n$ states is $\Theta(n \log n)$.
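Spelled out, the shattering definition I am using is

$$\operatorname{VCdim}(\mathcal{H}) \;=\; \max\left\{\, |C| \;:\; \forall f\colon C \to \{0,1\} \;\; \exists h \in \mathcal{H} \;\; h|_C = f \,\right\},$$

i.e. $\mathcal{H}$ shatters $C$ exactly when every labeling of $C$ is realized by some hypothesis.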

However, suppose we take the set of all such Turing machines, with $n$ large enough that a universal Turing machine is a member of $\mathcal{H}$. The result implies that there is a set $C$ (wlog, $C \subset \{0,1\}^*$) of size, say, $n^2$ that $\mathcal{H}$ cannot shatter. To my understanding, this means there is some labeling of $C$ such that no member of $\mathcal{H}$ computes the function
$$f(x) = \begin{cases} 1 & \text{if } x \in C(1) \\ 0 & \text{otherwise} \end{cases}$$

where $C(1)$ denotes the points of $C$ labeled "1".

But $C$ is finite, so $f$ is clearly computable; hence there is some Turing machine $M_C$ that computes it, and $M_C$ can be simulated by the universal Turing machine, which is in $\mathcal{H}$. This is a contradiction. Where is the flaw in this argument?

command line – Browsh in a vSphere virtual machine is unreadable

I recently installed Browsh on a virtual machine running on vSphere. I connect to the virtual machine using the integrated vSphere console viewer, and I am trying to use Browsh to access some websites. The problem is that when I run browsh in the terminal, the output is quite ugly and unusable. Is there a way to fix it without using a different SSH / console client? This is a screenshot of my output:
(screenshot: Browsh rendering garbled in the vSphere console)

As you can see, I really cannot work with this rendering of the website; I cannot even find the search field, let alone type any text into it. I forgot to mention that this is on Ubuntu 18.10. I am also using browsh-1.6.4.

Machine learning: can hidden Markov models be used for real-time analysis?

From what I understand, HMMs construct an underlying sequence of states to maximize the probability of a sequence of observations. As far as I can tell, that should make them unsuitable for use before the full sequence is known, but maybe I am misunderstanding something and it is possible after all.

Example of why I think this does not work:

States: A, B
Transitions: 0.9 stay in the same state, 0.1 switch states
A: 0.7 "Says A", 0.3 "Says either"
B: 0.6 "Says B", 0.4 "Says either"

If we receive the observation sequence ("Says A", "Says either"), we should infer the states (A, A), because the first state must be A, and for the second, 0.9 * 0.3 = 0.27 > 0.1 * 0.4 = 0.04.
However, if we then receive a third observation "Says B", we should update our estimate to (A, B, B), because there has to be a switch somewhere, and the probability of observing "Says either" is higher if the switch happens before that observation.
Obviously that update is not possible if we have already committed to classifying the second state as A in real time.
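To make the example concrete, here is a small Viterbi sketch of the model above (I assume a uniform start distribution, which the example leaves unstated); it reproduces the flip from (A, A) to (A, B, B):

```python
states = ["A", "B"]
start = {"A": 0.5, "B": 0.5}  # assumed uniform start distribution
trans = {s: {t: 0.9 if s == t else 0.1 for t in states} for s in states}
emit = {
    "A": {"Says A": 0.7, "Says either": 0.3, "Says B": 0.0},
    "B": {"Says B": 0.6, "Says either": 0.4, "Says A": 0.0},
}

def viterbi(obs):
    # paths maps each end state to (probability, best state sequence so far)
    paths = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        paths = {
            t: max(
                ((p * trans[s][t] * emit[t][o], seq + [t])
                 for s, (p, seq) in paths.items()),
                key=lambda x: x[0],
            )
            for t in states
        }
    return max(paths.values(), key=lambda x: x[0])[1]

print(viterbi(["Says A", "Says either"]))            # ['A', 'A']
print(viterbi(["Says A", "Says either", "Says B"]))  # ['A', 'B', 'B']
```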

Am I missing something, or do HMMs just not work for real-time classification?

Windows: virtual machine consuming double the CPU resources on the host operating system

I am using Hyper-V on Windows 10 Pro, and the guest operating system is Windows Server 2012.
I run only one VM at a time to compare the VM's CPU utilization against the host operating system's. Note: the VM is assigned the same number of cores as the hardware (there is no limit on any resource).

I do not know why the CPU utilization of the Host OS is almost double that of the VM.

37% CPU in the VM and 69% CPU in the host OS

As shown in the image above, the CPU usage of the virtual machine is only 37%, while the CPU usage of the host operating system is 69%.

Results across several tests (1 VM at a time):

VM CPU usage (%) : Host CPU usage (%)
13% : 35%
37% : 69%
50% : 96%

I have also tried the above on VMware, and the result is almost the same.

I am running only one VM at a time, and there is no limit on any hardware resource. I have tested this on various hardware and different virtual machines; the result is almost the same.

Based on these practical results, does it mean that I can effectively use only about 50% of the host machine's (hardware) CPU when using Hyper-V or VMware?

logic – Can undecidability theorems be detected by a machine?

This question was originally posted on MathOverflow, but a comment recommended that I rewrite it as a CS question.

This is not a mathematically formalized question. I am sorry for that, but I think it is more math than philosophy.

When we prove a theorem A that says "B is undecidable", we try to prove neither B nor (not B). Can a machine do the same? Can it detect the "meaning" of a statement, such as "something is undecidable"?

Here is a reason why I do not believe it.

Suppose a predicate

universal_Turing_machine(program, input, output)

is true if and only if running "program" on "input" produces "output". Of course, if the program does not halt, the predicate is false for every "input" and "output".

Now, let x be the Gödel number of a sentence. Consider the following sentence:

there is no y such that
  y is the Gödel number of a string that ends with the sentence coded by x,
  and universal_Turing_machine(program, y, true)

If "program" acts as a "decision program that accepts valid proofs", this sentence obviously means "the sentence coded by x is not provable". If not, the sentence does not express undecidability. Therefore, if a machine can detect undecidability theorems, it has to be able to detect programs that act as "decision programs that accept valid proofs".

But according to Rice's theorem, detecting whether a program has a specific semantic property is not possible.
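The Rice-style obstruction can be sketched concretely (all names here are illustrative, not from any real system): if we could decide "this program acts as a proof checker", we could decide halting, by wrapping an arbitrary machine in front of a genuine checker.

```python
def make_candidate(machine, machine_input, real_checker):
    """Return a program that behaves as a proof checker if and only if
    `machine` halts on `machine_input`."""
    def candidate(proof):
        machine(machine_input)       # may run forever
        return real_checker(proof)   # reached only if `machine` halted
    return candidate

# Toy demonstration with a machine that does halt:
real_checker = lambda proof: proof == "valid proof"
halting_machine = lambda x: None
candidate = make_candidate(halting_machine, 0, real_checker)
print(candidate("valid proof"))  # True: behaves exactly as the checker
```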

Do you think this reasoning makes sense? Since this is not a purely mathematical question, I hope to hear your opinions. Thank you.

finder – Removing the Spotlight index for a Time Machine drive that no longer exists

Suppose an old Time Machine drive is connected to a fresh macOS installation and configured as the main backup storage (but no backup has yet been started from the new installation). Spotlight started indexing the old disk, as usual.

How do I erase the indexing data related to this disk, given that the disk has since been erased (to ExFAT) with Disk Utility?

networks – Forwarding internet access to a remote machine

Unlike forwarding an internet connection through SSH to a Linux console, my REMOTE machine is not behind a firewall.

REMOTE

  • A machine that is connected locally only to the SERVER and a couple of machines on the same switch, in the 192.168.x.x IP range

SERVER

  • A server that is connected to an internal network shared with my LAPTOP and other machines (in the 10.0.x.x IP range)
  • The server is also connected to a local switch that is not on the internal network, but that allows connecting to REMOTE (in the 192.168.x.x IP range)
  • It has an internet connection.

LAPTOP

  • My machine, which can access the SERVER through the internal network shared with the SERVER and other machines, but cannot reach REMOTE
  • It has an internet connection.

To access REMOTE from the LAPTOP, I usually make two hops: first to SERVER and then to REMOTE:

$ ssh
$ ssh

My question is: how can I forward all traffic to and from REMOTE to the internet through the SERVER?

I do not have physical or GUI access to either SERVER or REMOTE, so I can only do this from the LAPTOP via the command line.
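For reference, this is the kind of NAT configuration I imagine on SERVER; the interface name (eth0) and the 192.168.0.0/24 subnet are assumptions about my topology, not tested values:

```shell
# On SERVER (as root): enable forwarding and masquerade traffic coming
# from REMOTE's switch subnet out of the internet-facing interface.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -o eth0 -j MASQUERADE

# On REMOTE: point the default route at SERVER's address on the shared
# switch (placeholder address below).
# ip route add default via 192.168.0.1
```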