group policy – Windows Server 2016 Active Directory: Granting Admin to a User for only specific machines

I have granted an Active Directory user local admin rights on several machines, because the user requested and needed this access. I did not want to grant admin rights across the entire domain and server, only on specific terminals.

I found a post online about granting local administrator for specific machines using group policy management and group policy management editor. (Found below)

I created an OU (All Computers, containing every computer in the domain) and, inside it, a second OU (Specific Computers) containing the computers on which I wanted this user to have admin access.

I applied the custom Local Admin GPO and linked it to the Specific Computers OU inside the All Computers OU. This seems to have worked: the user now has local admin on those specific computers.

My question is: what if, at another time, I wanted to give someone else admin access to only 2-3 of the computers inside the Specific Computers OU?

Is there an easier way to do this? I played around with a security group (added specific computers) and tried to have this be the method to grant admin access but could not figure it out.

Please let me know what you think.

Thank you!

Password policy for elderly clientele

I work for a company in which the age of our average user is over 70. We have an app that helps them collect and submit physiological data to their doctors.

I found this question that I believe is helpful if you’re helping your mother set a password: Secure Memorable Passwords for Older Users

However, we’re struggling to develop a policy for our 5000+ users, particularly given these additional wrinkles:

  • The users’ accounts are set up at the doctor’s office by a non-technical medical professional who probably thinks "Dog123" is a good password. We can educate them about password complexity, but getting them to similarly educate users on-site is a different matter.
  • Many of our users don’t have an email address, making it infeasible to send a password reset email
  • Password managers are also infeasible, because we can’t expect our medical staff to be setting up LastPass for the users (especially with no email address)
  • This is medical data, with all the regulation that comes with it.

Any suggestions for a password policy that secures our sensitive data without frustrating and driving away our entire user base?

Active Directory Group Policy – Allow Outlook through proxy settings

I am trying to allow Outlook (MS Office Standard 2016) through an Active Directory Group Policy proxy configuration I have created.

I am allowing a specific set of websites to a specific set of workstations. I have made this work by:

a) removing Firefox, Chrome, and IE11 from the workstation, leaving only Edge, and

b) in Group Policy (User Config -> Preferences -> Windows Settings -> Registry), I have configured registry keys to set up a dummy proxy server and then override it to allow the specific websites through. It may not be pretty, but it is working.

When I have set up the above configuration, I find that Outlook is showing a “Disconnected” status with this Group Policy. What parameters in the ProxyOverride need to be configured to allow Outlook access through the proxy?
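
To make the setup above concrete, here is a rough Python sketch of the registry values involved (they live under HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings). The host names and the dummy proxy address are placeholders, not the values from my GPO, and `build_proxy_settings` is just an illustrative helper. Whatever hosts Outlook needs would have to be appended to the same semicolon-separated ProxyOverride string; which hosts those are depends on the mail server and is exactly what I am unsure about.

```python
# Placeholder hosts: stand-ins for the "specific set of websites".
ALLOWED_HOSTS = ["intranet.example.com", "*.example.org"]


def build_proxy_settings(allowed_hosts, dummy_proxy="127.0.0.1:9"):
    """Return the registry values for a 'dummy proxy plus bypass list' setup.

    dummy_proxy is an assumed unreachable address, so anything that is not
    on the bypass list fails to connect.
    """
    return {
        "ProxyEnable": 1,
        "ProxyServer": dummy_proxy,
        # ProxyOverride is a single string; entries are separated by
        # semicolons and may contain * wildcards.
        "ProxyOverride": ";".join(allowed_hosts),
    }


settings = build_proxy_settings(ALLOWED_HOSTS)
print(settings["ProxyOverride"])
```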

Environment:
Windows Server 2016 / Active Directory
Windows 10
Outlook (MS Office Standard 2016)

Thanks for your input!


chrome-extension as source of CSP policy violation

Here is an extract from a CSP report

[csp-report] => Array
        [blocked-uri] => chrome-extension
        [column-number] => 27329
        [document-uri] =>
        [line-number] => 3
        [referrer] =>
        [source-file] =>
        [violated-directive] => font-src

Here is the list of font sources that are legitimate (according to the policy):


I am trying to understand what happened that gave rise to the CSP policy violation.
I note, too, that when I visit the page concerned, everything appears normal and no violation of the policy occurs.

So, here is how I interpret the report, but I am not at all sure of my interpretation. So I am asking for help in understanding it.

  1. A visitor is referred to mypage having found it via a Google search.
  2. The visitor performs some manipulation that triggers a call to a function in wp-includes/js/jquery/jquery.js?ver=1.12.4-wp (it is a WordPress site, as you might guess)
  3. That function attempts to load a font from a source that is not allowed as per my policy (I do not display the policy here, for reasons of confidentiality)
  4. The CSP font-src policy blocks the loading of the font and generates the report above (redacted).


  1. Is my understanding of the events correct?
  2. Why is the blocked uri called “chrome-extension”? Does this imply that the font source is expected to be local to the server? Or local to the client?
  3. If, for some reason, I wanted to change the policy to accept whatever font the page attempted to load, how would I do that, given that chrome-extension is not a valid entry in the list of sources?
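
For what it is worth, one common reading of such a report is that the blocked resource was requested by an extension running in the visitor's browser rather than by the page itself, and that the browser reports only the scheme. Under that assumption, a report handler could separate these reports from the rest. The sketch below is in Python rather than the PHP my handler uses; the list of schemes is illustrative and `is_extension_noise` is a hypothetical helper name.

```python
import json

# Schemes that point at a browser extension rather than at the page or
# the server (illustrative list, not exhaustive).
EXTENSION_SCHEMES = ("chrome-extension", "moz-extension", "safari-extension")


def is_extension_noise(report_body: str) -> bool:
    """Return True if the report's blocked-uri points at a browser extension."""
    report = json.loads(report_body).get("csp-report", {})
    blocked = report.get("blocked-uri", "")
    return blocked.startswith(EXTENSION_SCHEMES)


# The fields from the report above, with the redacted ones left out.
sample = json.dumps({
    "csp-report": {
        "blocked-uri": "chrome-extension",
        "violated-directive": "font-src",
        "line-number": 3,
        "column-number": 27329,
    }
})
print(is_extension_noise(sample))
```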

group policy – Can I force the BITS service to stay on Automatic (and keep it from turning off) on Windows 10 Pro?

My problem is that I provide support for a Windows 10 computer where the BITS service works perfectly (no problems with Windows Update), but which runs a third-party program with a flawed update mechanism that can’t cope unless BITS is running constantly (or at least when the program starts). From searching, I understand that BITS itself, Windows Update, or some other mechanism stops the service and sets its startup type to Manual after it has been idle for a while, and that this is the intended behavior, so just setting it to Automatic (or Automatic (Delayed Start)) doesn’t help.

The not so helpful provider of the third party program suggests that we start it manually but the user of that computer doesn’t have admin rights. So my question is whether there is some way to force BITS to be on constantly?

All I’ve managed to find searching the net is references to Group policies that don’t seem to exist in Windows 10 (they are probably only present in server versions) and they only mention controlling the startup type (which may or may not prevent the service from stopping itself after a few minutes of inactivity).

mac – Trouble running .jar files (System Policy: deny(1) file-read-data)

Recently I needed to run some .jar files. Unfortunately, each time I ran one, an information box showed up saying “The Java JAR file “name” could not be launched.” and telling me to check the console for error messages. I tried completely removing Java and reinstalling it, but the error kept happening. Finally I checked Console and saw that each time I tried to open a .jar file it logged “System Policy: deny(1) file-read-data”.

Does anyone know what this error is and how I can fix it? Is it an issue with the jar launcher? I’ve had it happen with multiple completely different jar files.

Thanks for the help.

content security policy – Why is only script-src unsafe-inline reported as a high severity finding?

I’m evaluating a CSP policy using a CSP evaluation tool. The policy is configured as follows:

default-src 'self';object-src 'self';script-src 'self' 'unsafe-inline' 'unsafe-eval';script-src-elem 'self' 'unsafe-inline' 'unsafe-eval';script-src-attr 'self' 'unsafe-inline';

Why is only the 'unsafe-inline' of 'script-src' reported as a high severity finding? From what I understand, 'unsafe-inline' in 'script-src-elem' could be dangerous too. What am I missing?
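
To show what I mean, here is a small Python sketch that splits the policy above into directives and lists every directive containing a given keyword. It is only a string-level parse, not a real CSP evaluator, and `parse_csp` and `directives_with` are names I made up. As far as I understand the CSP Level 3 spec, script-src is only the fallback for script-src-elem and script-src-attr, so the more specific directives matter when they are present.

```python
POLICY = (
    "default-src 'self';object-src 'self';"
    "script-src 'self' 'unsafe-inline' 'unsafe-eval';"
    "script-src-elem 'self' 'unsafe-inline' 'unsafe-eval';"
    "script-src-attr 'self' 'unsafe-inline';"
)


def parse_csp(policy: str) -> dict:
    """Split a CSP string into {directive name: [source expressions]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # The first token is the directive name, the rest are its sources.
            # If a directive is repeated, only the first occurrence is kept.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def directives_with(policy: str, keyword: str) -> list:
    """List the directives whose source list contains the given keyword."""
    return [name for name, sources in parse_csp(policy).items() if keyword in sources]


print(directives_with(POLICY, "'unsafe-inline'"))
# prints ['script-src', 'script-src-elem', 'script-src-attr']
```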

python – Implementation of Policy Gradient Reward Design paper

I’ve implemented the first experiment from the Reward Design via Online Gradient Ascent paper. I don’t have any specific concerns, but it’s my first time using multiprocessing or doing reinforcement learning, and I want to add this work to my portfolio. So I want to know if there is anything wrong with this code or if it can be improved in any way. The number of trials is 13 instead of 130 like in the paper because I don’t have that much compute.

Main file:

import numpy as np
from agent import Agent
from environment import BirdEnv
from pgrd import PGRD
from gym.utils.seeding import np_random
from multiprocessing import Pool
import os

# 5 actions: move right, left, down, up, eat the worm
#the agent observes the full state given by 9*agent_location + worm_location
TAU = 100
GAMMA = 0.95

if __name__ == "__main__":
    rng_env, _ = np_random()
    env = BirdEnv(rng_env)
    for depth in range(7):
        for alpha in (0, 2e-6, 5e-6, 2e-5, 5e-5, 2e-4, 5e-4, 2e-3, 5e-3, 1e-2):
            for beta in (0, 0.4, 0.7, 0.9, 0.95, 0.99):
                def run_trial(num_trial):
                    rng_agent, _ = np_random()
                    agent = Agent(depth, TAU, GAMMA, rng_agent, NUM_ACTIONS, NUM_STATES)
                    model = PGRD(agent, env, alpha, beta)
                    return model.learn(total_timesteps=TOTAL_TIMESTEPS, visualize=False)
                pool = Pool(os.cpu_count())
                returns = pool.map(run_trial, np.arange(num_trials))
                returns = np.sum(np.array(returns), axis=0) / num_trials
                np.save("results/Result_depth_{}_alpha_{}_beta_{}.npy".format(depth, alpha, beta), returns)

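One thing I am unsure about in the main file is the multiprocessing part. As a sketch (not what I currently run): defining the worker at module level and passing the hyperparameters explicitly should make it independent of the start method, since under `spawn` (the default on Windows and macOS) a function defined inside the `if __name__ == "__main__":` block is not visible to the child processes. A `with` block also releases each pool instead of creating a new one per configuration and never closing it. `fake_learn` is a hypothetical stand-in for `PGRD(...).learn(...)` so the example is self-contained.

```python
from functools import partial
from multiprocessing import Pool


def fake_learn(depth, alpha, beta, total_timesteps):
    """Hypothetical stand-in for PGRD(...).learn(); returns one value per timestep."""
    return [depth + alpha + beta] * total_timesteps


def run_trial(num_trial, depth, alpha, beta, total_timesteps):
    """Module-level worker, so child processes can import it by name."""
    return fake_learn(depth, alpha, beta, total_timesteps)


def average_returns(per_trial):
    """Average the per-timestep returns across trials."""
    n = len(per_trial)
    return [sum(step) / n for step in zip(*per_trial)]


def run_config(depth, alpha, beta, num_trials, total_timesteps, processes=2):
    """Run all trials for one hyperparameter setting and average them."""
    worker = partial(run_trial, depth=depth, alpha=alpha, beta=beta,
                     total_timesteps=total_timesteps)
    # The context manager shuts the pool down once map() has returned.
    with Pool(processes) as pool:
        per_trial = pool.map(worker, range(num_trials))
    return average_returns(per_trial)

# Usage, to be placed under an `if __name__ == "__main__":` guard:
#     run_config(depth=1, alpha=0.0, beta=0.5, num_trials=3, total_timesteps=4)
```
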
agent.py:

import numpy as np
from collections import defaultdict
from functools import lru_cache

def softmax(action_values, tau):
    """
    Arguments: action_values - 1-dimensional array
    tau - temperature
    """
    preferences = action_values * tau
    max_preference = np.max(preferences)
    exp_prefs = np.exp(preferences - max_preference)
    return exp_prefs / np.sum(exp_prefs)

class Agent:
    def __init__(self, depth, tau, gamma, rng, nA, nS):
        self.nA = nA
        self.nS = nS
        self.depth = depth #depth of planning
        self.tau = tau #policy temperature
        self.gamma = gamma #discount rate
        #agent's model of the environment
        #N[s][a] = {'total': total_visits, 'counts': {s': x, ...}}
        #N[s][a]['counts'][s'] - number of visits to s' after taking action a in state s
        #N[s][a]['counts'][s'] / N[s][a]['total'] = Pr(s'|s, a)
        self.N = defaultdict(lambda: defaultdict(lambda: {'total':0, 'counts': defaultdict(lambda:0)}))
        self.rand_generator = rng

    def update(self, state, action, newstate):
        self.N[state][action]['total'] += 1
        self.N[state][action]['counts'][newstate] += 1

    def plan(self, state, theta):
        """ Compute d-step Q-value function and its theta-gradient at state"""
        def _plan(self, state, d):
            """ Recursive memoized function"""
            reward_grad = np.zeros((self.nA, self.nS, self.nA))
            for a in range(self.nA):
                reward_grad[a, state, a] = 1
            if d == 0:
                action_values = theta[state]
                value_grad = reward_grad
            else:
                inc = np.zeros(self.nA)
                grad_inc = np.zeros((self.nA, self.nS, self.nA))
                for action in self.N[state].keys():
                    for state_next, count in self.N[state][action]['counts'].items():
                        values_next, grad_next = _plan(self, state_next, d-1)
                        action_next = np.argmax(values_next)
                        p = count / self.N[state][action]['total']
                        inc[action] += values_next[action_next] * p
                        #gradient of the greedy next-step value, weighted by Pr(s'|s, a)
                        grad_inc[action] += grad_next[action_next] * p

                action_values = theta[state] + self.gamma * inc
                value_grad = reward_grad + self.gamma * grad_inc
            return action_values, value_grad
        return _plan(self, state, self.depth)
    def logpolicy_grad(self, value_grad, probas, action):
        """
        value_grad: nA x nS x nA
        probas: nA
        action: int
        grad: nS x nA
        """
        grad = self.tau * (value_grad[action] - np.tensordot(probas, value_grad, axes=1))
        return grad
    def policy(self, action_values):
        probas = softmax(action_values, self.tau)
        return probas

    def step(self, state, theta):
        action_values, value_grad = self.plan(state, theta)
        # compute the Boltzmann stochastic policy parametrized by action_values
        probas = self.policy(action_values) #shape: nA
        # select action according to policy
        action = self.rand_generator.choice(np.arange(self.nA), p=probas)
        grad = self.logpolicy_grad(value_grad, probas, action)
        return action, grad

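environment.py:

import sys
import numpy as np
from collections import defaultdict

#rows alternate between cells ("C") and walls ("="); a space is a gap in the wall
MAP = ["CCC",
       " ==",
       "CCC",
       " ==",
       "CCC"]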

class BirdEnv:
    """Bird looks for a worm"""
    metadata = {'render.modes': ['human']}

    def __init__(self, rng):
        self.nA = 5 #number of actions: right, left, down, up, eat the worm
        self.nC = 9 #number of cells in 3x3 grid
        self.nS = self.nC**2 # state = (position of bird, position of worm)
        self.ncol = 3 #number of columns in 3x3 grid
        self.rand_generator = rng
        #transitions[c][a] == [(probability, nextcell),..]
        self.transitions = {c : {} for c in range(self.nC)}
        def move(i, j, inc):
            cell_i = max(min(i + inc[0], 4), 0)
            cell_j = max(min(j + inc[1], 2), 0)
            #move according to action, if you can
            if MAP[cell_i][cell_j] == "=":
                cell_i = i
            elif MAP[cell_i][cell_j] == " ":
                cell_i += inc[0]
            cell = 3 * (cell_i // 2) + cell_j
            return cell
        for i, row in enumerate(MAP):
            for j, char in enumerate(row):
                if char == "C":
                    d = defaultdict(lambda:0)
                    for inc in ((0,1), (0, -1), (1, 0), (-1,0)):
                        cell = move(i,j,inc)
                        d[cell] += 0.025
                    for action, inc in enumerate(((0,1), (0, -1), (1, 0), (-1,0))):
                        cell = move(i,j,inc)
                        trans = d.copy()
                        trans[cell] += 0.9
                        self.transitions[3*(i//2)+j][action] = [(prob, nextcell) for nextcell, prob in trans.items()]
        #initial cell distribution (always start in the upper left corner)
        self.icd = (1, 0, 0, 0, 0, 0, 0, 0, 0)
        self.cell = self.rand_generator.choice(np.arange(self.nC), p=self.icd)
        #initial worm distribution: in one of the three right-most locations at the end of each corridor
        self.iwd = (0, 0, 1./3, 0, 0, 1./3, 0, 0, 1./3)
        self.worm = self.rand_generator.choice(np.arange(self.nC), p=self.iwd)
        self.lastaction = 4
    def state(self):
        return self.nC * self.cell + self.worm
    def step(self, action):
        """Execute one time step within the environment"""
        reward = 0
        if action == 4:
            #try eating the worm
            if self.cell == self.worm:
                #move worm into one of the empty cells on the right
                self.worm = self.rand_generator.choice([(self.worm + 3) % self.nC, (self.worm + 6) % self.nC])
                reward = 1
        else:
            #movement actions 0-3: sample the next cell from the transition model
            transitions = self.transitions[self.cell][action]
            i = self.rand_generator.choice(np.arange(len(transitions)), p=[t[0] for t in transitions])
            _, cell = transitions[i]
            self.cell = cell
        self.lastaction = action
        state = self.state()
        return state, reward

    def reset(self):
        # Reset the state of the environment to an initial state
        self.cell = self.rand_generator.choice(np.arange(self.nC), p=self.icd)
        self.worm = self.rand_generator.choice(np.arange(self.nC), p=self.iwd)

    def render(self, mode='human', close=False):
        # Render the environment to the screen
        outfile = sys.stdout
        desc = [["C", "C", "C"], ["C", "C", "C"], ["C", "C", "C"]]
        row, col = self.cell // self.ncol, self.cell % self.ncol
        desc[row][col] = "B"
        row, col = self.worm // self.ncol, self.worm % self.ncol
        desc[row][col] = "W"
        if self.lastaction is not None:
            outfile.write("  ({})\n".format(
                ["Right", "Left", "Down", "Up", "Eat"][self.lastaction]))
        outfile.write("\n".join(''.join(line) for line in desc)+"\n")

pgrd.py:

import numpy as np
class PGRD:
    def __init__(self, agent, env, alpha, beta):
        self.agent = agent
        self.env = env
        self.alpha = alpha #step size
        self.beta = beta #decay rate of the gradient trace z
        #theta is initialized so that the initial reward function = objective reward function
        self.theta = np.zeros((env.nS, env.nA))
        for cell in (2,5,8):
            #state 10*cell = 9*cell + cell: bird and worm in the same cell
            self.theta[10*cell, 4] = 1
        #variable to store theta gradient
        self.z = np.zeros((env.nS, env.nA))

    def learn(self, total_timesteps, visualize=False):
        state = self.env.state()
        total_reward = 0
        returns = []
        for i in range(total_timesteps):
            if visualize:
                self.env.render()
            action, grad = self.agent.step(state, self.theta)
            newstate, reward = self.env.step(action)
            total_reward += reward
            #update agent's model of the environment:
            self.agent.update(state, action, newstate)
            state = newstate
            #update theta
            self.z = self.beta * self.z + grad
            self.theta += self.alpha * reward * self.z
            #cap parameters at +-1:
            self.theta = np.maximum(self.theta, -1)
            self.theta = np.minimum(self.theta, 1)
            returns.append(total_reward / (i+1))
        return np.array(returns)

Tests and debugging helpers:

import numpy as np
#print environment transitions:
def print_env(env):
    def pretty(d, indent=0):
        for key, value in d.items():
            print('\t' * indent + str(key))
            if isinstance(value, dict):
                pretty(value, indent+1)
            else:
                print('\t' * (indent+1) + str(value))
    pretty(env.transitions, indent=1)

def test_value_grad(agent):
    theta = np.random.rand(agent.nS, agent.nA)
    delta_theta = 1e-2 * np.random.rand(agent.nS, agent.nA)
    state = 0
    values1, grad1 = agent.plan(state, theta)
    values2, grad2 = agent.plan(state, theta + delta_theta)
    assert np.allclose(values2 - values1, np.tensordot(grad1, delta_theta, axes=2))
def test_policy_grad(agent):
    theta = np.random.rand(agent.nS, agent.nA)
    delta_theta = 1e-3 * np.random.rand(agent.nS, agent.nA)
    state = 0
    for action in range(5):
        values1, value_grad1 = agent.plan(state, theta)
        logprobas1 = np.log(agent.policy(values1))
        values2, value_grad2 = agent.plan(state, theta + delta_theta)
        logprobas2 = np.log(agent.policy(values2))
        grad = agent.logpolicy_grad(value_grad1, agent.policy(values1), action)

        assert np.allclose(logprobas2[action] - logprobas1[action], (grad * delta_theta).sum())
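
The gradient tests above compare a first-order change against the analytic gradient. As a cross-check of the softmax part in isolation, here is a dependency-free sketch that compares the analytic gradient of log pi(action) with respect to the action values against a central difference. It is a simplified stand-in written for this post (plain lists instead of NumPy arrays, no planning step), and the values and temperature are arbitrary.

```python
import math


def softmax(values, tau):
    """Boltzmann policy: probabilities proportional to exp(tau * value)."""
    prefs = [tau * v for v in values]
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def log_policy_grad(values, tau, action):
    """Analytic gradient of log pi(action) with respect to each action value."""
    probas = softmax(values, tau)
    return [tau * ((1.0 if j == action else 0.0) - probas[j])
            for j in range(len(values))]


def finite_difference(values, tau, action, eps=1e-6):
    """Central-difference estimate of the same gradient."""
    grad = []
    for j in range(len(values)):
        up = list(values)
        up[j] += eps
        down = list(values)
        down[j] -= eps
        diff = math.log(softmax(up, tau)[action]) - math.log(softmax(down, tau)[action])
        grad.append(diff / (2 * eps))
    return grad


values = [0.2, -0.1, 0.4, 0.0, 0.3]
analytic = log_policy_grad(values, tau=2.0, action=2)
numeric = finite_difference(values, tau=2.0, action=2)
# Largest absolute disagreement between the two gradients.
print(max(abs(a - n) for a, n in zip(analytic, numeric)))
```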

virtual machines – Change Azure VM Update Policy

I have multiple VMs running on Azure and noticed that their Windows Update policies differ. Unfortunately, I cannot locate the setting in the Azure portal. The first VM has the option “Never check for updates” configured, while the other VM has the option “Install updates automatically using Windows Update” configured.

When I click on it, it tells me that the settings are managed by our organization because of Azure. However, where do I change that? I cannot find anything in the portal, and I have been searching for hours. I would appreciate some help.

websockets – How to do load balancing for collaborative editing with multi-AZ and geoproximity-based routing policy

I am planning a project in which users collaboratively edit a document. To provide good latency, I am planning to deploy in multiple AWS regions with active-active MongoDB replicas and a geoproximity-based routing policy in AWS Route 53. This plan conflicts with another requirement: users in one region (say Asia) should be able to work collaboratively on the same document with users from a different region (say North America). With geoproximity routing, users get routed to the servers closest to them and will work off of different databases.

Having thought about the problem a bit, the solution seems to be a custom load balancer that routes to the appropriate deployment based on the request data and some queries to the database. Am I correct in my thinking? Also, how would I handle forwarding WebSocket connections to the correct server?
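
To make the idea concrete, here is a minimal Python sketch of the routing decision I have in mind: each document is pinned to one "home" region, and every client editing it is sent to that region's WebSocket endpoint regardless of where the client is. Everything here is hypothetical, including the `DocumentRouter` name and the hostname pattern, and the in-memory dict is a stand-in for the database lookup.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRouter:
    """Maps each document to a single 'home' region (hypothetical design)."""
    default_region: str
    # Stand-in for the database query mentioned above.
    home_regions: dict = field(default_factory=dict)

    def region_for(self, document_id: str, client_region: str) -> str:
        """Return the region that should serve this document.

        The first client to open a document pins it to that client's region;
        later clients are routed there regardless of their own location.
        """
        return self.home_regions.setdefault(
            document_id, client_region or self.default_region)

    def websocket_url(self, document_id: str, client_region: str) -> str:
        """Build the WebSocket endpoint the client should connect to."""
        region = self.region_for(document_id, client_region)
        # Hypothetical hostname pattern; a real deployment has its own endpoints.
        return f"wss://{region}.collab.example.com/docs/{document_id}"


router = DocumentRouter(default_region="us-east-1")
print(router.websocket_url("doc-42", "ap-south-1"))  # first editor, in Asia
print(router.websocket_url("doc-42", "us-east-1"))   # second editor, in North America
```

Both calls return the same URL, so the two editors end up on the same server and the same database. The trade-off is that the second editor pays cross-region latency.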