Clark Hale’s Blog

K8s PVs, NFS and ACLs and avoiding umask changes.

2024-01-19T11:59:00-05:00

I recently ran into a PersistentVolume permissions issue with my Tekton Pipelines.

I needed my pods to create files with the group write bit set. Essentially, I needed a umask of 0002, but my pods were running with umask of 0022.

I was afraid I was going to have to start twiddling around with umasks for my Tekton tasks, but instead found that I could solve my issue using Linux Access Control Lists (ACLs).

ACLs are not often used on Linux, but can be quite powerful. For this article, the relevant feature is the ability to set default permissions for created files. These ignore the current process’s umask and act as a directory-specific umask.

The Setup

My PersistentVolumes were NFSv4 shares exported from a RHEL8 host, so I would need to set these ACLs on directories exported from that server.

Here is the normal behavior:

nfs-server#  cd /path/to/nfs_share

nfs-server# umask
0022      <-  Files should be created rw-r--r--, directories rwxr-xr-x

nfs-server# getfacl .  
# file: .
# owner: root
# group: root
# flags: -s-
user::rwx
group::rwx
other::r-x


nfs-server# touch without_default_perms_acl
nfs-server# ls -l
total 0
-rw-r--r--.  1 root root   0 Jan 19 11:44 without_default_perms_acl

Now, let me set default permissions via an ACL:

nfs-server# cd /path/to/nfs_share

# Let's set our default ACLs for group and other.

nfs-server# setfacl -m d:g::rwX . 
nfs-server# setfacl -m d:o::r-X .

nfs-server# getfacl .
# file: .
# owner: root
# group: root
# flags: -s-
user::rwx
group::rwx
other::r-x
default:user::rwx
default:group::rwx
default:other::r-x

nfs-server# umask
0022      <-  Our umask remains unchanged

nfs-server# touch with_default_perms_acl
nfs-server# ls -l 
total 0
-rw-rw-r--. 1 root root 0 Jan 19 11:50 with_default_perms_acl
-rw-r--r--. 1 root root 0 Jan 19 11:44 without_default_perms_acl

As you can see, the file created AFTER setting the default ACLs ignores the umask and instead uses the default set in the ACL! New subdirectories will also inherit these ACLs.
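The arithmetic behind the two listings can be sketched in shell. This is a minimal sketch that assumes touch requests mode 0666 when it creates a file (as it normally does) and that the default ACL entries are the rwx/rwx/r-x set shown above; the variable names are only for illustration.

```shell
# Mode that touch requests when it creates a new file (assumption: 0666).
requested=0666

# Without a default ACL, the bits set in the umask are cleared.
umask_bits=0022
without_acl=$(printf '%04o' $(( $requested & ~$umask_bits )))

# With a default ACL, the umask is ignored; the requested mode is instead
# intersected with the default entries (user rwx, group rwx, other r-x = 0775).
default_acl=0775
with_acl=$(printf '%04o' $(( $requested & $default_acl )))

echo "without default ACL: $without_acl"
echo "with default ACL:    $with_acl"
```

The two results correspond to the rw-r--r-- and rw-rw-r-- entries in the ls -l output above.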

When this directory is exported by NFSv4, this ACL is honored by the client.

For more information, see acl(5) and setfacl(1).

A word of warning: not all NFS servers will support this, and I’m fairly certain you must be using NFSv4 for this to work. I’ve tested it with a RHEL8 nfs-server, but YMMV for other NFS servers.

WHY would I need to do this?

OpenShift, unique among K8s distributions, semi-randomizes the UID and GID of its pods. This is a security measure that adds barriers for attackers.

When two pods need to share a PV, we need to carefully construct permissions so that they share a common group and that group has write permissions on all files.

Normally this is done by:

Changing the group of the root of the NFS share to 0 (root) or a defined common group
Setting the setgid-bit, so that all new files and directories inherit that group.
Setting the umask to 0002, so that group read-write is set on all files.

Steps #1 and #2 above are fairly straightforward, but setting the umask on every pod in every scenario can be difficult, especially if you’re dealing with pre-created container images and many images (like in a Tekton pipeline).

Using default ACLs, instead of relying on umask, allows me to make one change, on the server side, that is automatically followed by all pods.
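As a rough sketch, the whole server-side preparation (common group, setgid bit, default ACLs) could be wrapped in a small helper. prepare_share and the RUN switch are hypothetical names, not part of any tool. By default the helper only prints the commands; with RUN=1 it executes them, which needs root on the NFS server.

```shell
# Hypothetical helper: prints the commands unless RUN=1 is set.
# Assumes the share path contains no whitespace.
prepare_share() {
    share=$1
    group=${2:-0}    # GID 0 (root) unless a common group is given
    for cmd in \
        "chgrp $group $share" \
        "chmod g+s $share" \
        "setfacl -m d:g::rwX,d:o::r-X $share"
    do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd          # word-split on purpose to run the command
        else
            echo "$cmd"   # dry run: just show what would be executed
        fi
    done
}

prepare_share /path/to/nfs_share
```

Printing first makes it easy to review the three changes before touching an exported directory.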

Renaming RAID Devices

2022-06-17T00:05:00-04:00

I learned recently that software RAID devices under Linux can have friendly names. I had never specified a name when creating arrays.

I’ve recently been updating my fileserver to RHEL 8 and working with the original array I created in 2015 and documented in my post: RAID 6 and LVM. In the original version of that post, I did not specify a name.

I’ve found that naming my RAID volumes has a great benefit when reinstalling the system via Kickstart, which I’ll write about more in detail later.

So, my RAID volumes are unnamed….how do I give them a name?

First, let’s look at our RAID volume using the mdadm --detail command:

# mdadm --detail /dev/md200
/dev/md200:
           Version : 1.2
     Creation Time : Thu Sep 24 17:47:54 2015
        Raid Level : raid6
        Array Size : 3906521088 (3.64 TiB 4.00 TB)
     Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Jun  9 03:11:45 2022
             State : clean 
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : 200
              UUID : 6d90c19a:9812e659:6854747b:a89f1280
            Events : 34710

    Number   Major   Minor   RaidDevice State
       0       8       97        0      active sync   /dev/sdg1
       1       8       17        1      active sync   /dev/sdb1
       2       8       49        2      active sync   /dev/sdd1
       3       8      113        3      active sync   /dev/sdh1
       4       8       65        4      active sync   /dev/sde1
       5       8       81        5      active sync   /dev/sdf1

Above, there is a name field, but it’s not really a human name, just 200. This is really just the /dev/md### number.

Renaming is a relatively simple operation, but it can’t be done live.

In my setup, I have an LVM Volume Group called galadriel. I want to rename my RAID device to also be galadriel to match.

Run mdadm --detail and save the output somewhere for reference.
Unmount all volumes related to either the RAID device or volume group using the RAID.
Stop the volume group. -a n deactivates the volume group.
```
 vgchange -a n galadriel
```
Stop the md device using mdadm
```
 sudo mdadm --stop /dev/md200
```
Reassemble with the new name. This reassembles the array with the --name option. The device names can be gleaned from the output of mdadm --detail.
```
 sudo mdadm --assemble --update=name --name=galadriel /dev/md200 /dev/sdg1 /dev/sdb1 /dev/sdd1 /dev/sdh1 /dev/sde1 /dev/sdf1
```
Reactivate the volume group
```
 vgchange -a y galadriel
```
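Pulling the member devices out of the saved mdadm --detail output by hand is error-prone, so here is a small sketch that does it with awk. The sample text is a shortened copy of the output shown earlier; on a real system the same awk filter could be fed from mdadm --detail /dev/md200 directly. The variable names are only for illustration.

```shell
# Shortened sample of the `mdadm --detail /dev/md200` output shown earlier.
detail_output='/dev/md200:
      Raid Devices : 6
    Number   Major   Minor   RaidDevice State
       0       8       97        0      active sync   /dev/sdg1
       1       8       17        1      active sync   /dev/sdb1
       2       8       49        2      active sync   /dev/sdd1'

# Keep the last field of rows that start with a device number and end in a
# /dev/ path; this skips the "/dev/md200:" header line.
members=$(printf '%s\n' "$detail_output" |
    awk '$1 ~ /^[0-9]+$/ && $NF ~ /^\/dev\// { print $NF }' |
    tr '\n' ' ')
members=${members% }    # drop the trailing space left by tr

echo "$members"
```

The resulting list can be pasted after /dev/md200 in the mdadm --assemble command.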

Check that everything is OK by looking at /proc/mdstat and comparing the new output of mdadm --detail, vgs, and lvs.

 # cat /proc/mdstat
 md200 : active raid6 sdf1[1] sdh1[2] sdd1[3] sda1[4] sdc1[0] sdb1[5]
       3906521088 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
       bitmap: 0/8 pages [0KB], 65536KB chunk

It may be required to update /etc/mdadm.conf. In RHEL7+, the UUID is used, so only the dev path may need to be updated:

# cat /etc/mdadm.conf
    
   ### Snippet of relevant line:
ARRAY /dev/md/galadriel level=raid6 num-devices=6 UUID=6d90c19a:9812e659:6854747b:a89f1280

Fixing Fedora’s default FreeIPA config

2022-06-06T14:05:00-04:00

I periodically re-install Fedora on my laptops. Now that I have a fairly stable FreeIPA setup, I’ve been joining my laptops to FreeIPA during installation.

This works great for the first user, i.e. the one that is specified in the installer. However, other users are not able to log in after installation.

To make reading easier, I’ve put firstuser and seconduser as my two users. I’ve also clipped out hostnames and timestamps from journalctl output.

If attempting to log in as one of these other users via ssh, the following messages are logged in the journal:

sshd[928089]: pam_sss(sshd:auth): authentication success; logname= uid=0 euid=0 tty=ssh ruser= rhost=172.31.0.50 user=seconduser
sshd[928089]: pam_sss(sshd:account): Access denied for user seconduser: 6 (Permission denied)
sshd[928089]: Failed password for seconduser from 172.31.0.50 port 42840 ssh2
sshd[928089]: fatal: Access denied for user seconduser by PAM account configuration [preauth]

Failed password for seconduser makes this seem like a password issue, but it’s not. An actual bad password looks like this:

krb5_child[979500]: Preauthentication failed
krb5_child[979500]: Preauthentication failed
sshd[979494]: pam_sss(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=172.31.0.50 user=seconduser
sshd[979494]: pam_sss(sshd:auth): received for user seconduser: 7 (Authentication failure)
sshd[979494]: Failed password for seconduser from 172.31.0.50 port 39754 ssh2

The real error is actually coming from PAM, and we can see the difference between my scenario and a true wrong password. If we look at the first example, the real error is:

pam_sss(sshd:account): Access denied for user seconduser: 6 (Permission denied)

Compare that to the wrong password scenario, where the error is this:

pam_sss(sshd:auth): received for user seconduser: 7 (Authentication failure)

From this, we can see that a true wrong password fails in the auth modules of PAM (as is expected), but my error happens in the account modules.

So, one of the PAM account modules is denying my user.

Looking at /etc/pam.d/password-auth, which is a symlink to /etc/authselect/password-auth, we see:

account     required                                     pam_unix.so
account     sufficient                                   pam_localuser.so
account     sufficient                                   pam_usertype.so issystem
account     [default=bad success=ok user_unknown=ignore] pam_sss.so
account     required                                     pam_permit.so

I don’t really see anything here that would allow one user but deny another.

pam_sss.so is the module that interfaces with FreeIPA via sssd. It reads its configuration from /etc/sssd/sssd.conf, and there we find our culprit:

From sssd.conf:

[domain/mydomain.example.com]
simple_allow_users = $, firstuser

sssd-simple(5) provides very simple user and group allow/deny lists. In this case, if a user is not listed in simple_allow_users, then they are not allowed to log in.

What this means is, only firstuser is allowed to log in. Adding seconduser to the list will allow them to log in, and so forth.
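A minimal sketch of that change, run against a scratch copy rather than the live /etc/sssd/sssd.conf. The sed expression and the scratch file are only an illustration; the odd-looking $ entry is kept exactly as it appears in the snippet above. On the real file, sssd would need to be restarted afterwards for the change to take effect.

```shell
# Scratch copy standing in for /etc/sssd/sssd.conf.
conf=$(mktemp)
printf '%s\n' '[domain/mydomain.example.com]' \
              'simple_allow_users = $, firstuser' > "$conf"

# Append seconduser to the existing allow list ("&" is the matched line).
sed 's/^simple_allow_users = .*/&, seconduser/' "$conf" > "$conf.new"

cat "$conf.new"
```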

This seems like a bug. At the least it’s very unexpected behavior!

OpenShift authentication with IPA

2022-05-11T14:05:00-04:00

OpenShift and IPA Series

This post is part of a collection of blog posts related to OpenShift and FreeIPA (aka IdM).

OpenShift Certificate from IPA on RHEL 8
OpenShift authentication with IPA
OpenShift Group Syncing with IPA (Not yet published)
Automated Certificate Management with IPA and cert-manager (Not yet published)

Introduction

In the previous post, OpenShift Certificate from IPA on RHEL 8, I explained how to create certificates for an OpenShift cluster using FreeIPA/RHEL IdM.

Now, I want to be able to log into my OpenShift cluster using my FreeIPA credentials.

OpenShift Background

OpenShift has the concept of an IdentityProvider which connects a source of identity verification, like FreeIPA’s LDAP server, to OpenShift.

OpenShift 4.x uses an object called an OAuth to configure identity providers like LDAP. For OpenShift 3.x, this configuration is similar, but located in /etc/origin/master/master-config.yaml.

Let’s look at an example OAuth manifest:

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - ldap:
      attributes:
        email:
        - mail
        id:
        - dn
        name:
        - cn
        preferredUsername:
        - uid
      bindDN: uid=openshift,cn=sysaccounts,cn=etc,dc=private,dc=opequon,dc=net
      bindPassword:
        name: ldap-secret 
      ca:
        name: opequon-custom-ca-oauth
      insecure: false
      url: ldaps://ipa.private.opequon.net/cn=users,cn=accounts,dc=private,dc=opequon,dc=net?uid
    mappingMethod: claim
    name: ldapidp
    type: LDAP

This is an example of an LDAP identity provider; there are other flavors if desired. Notice that OAuth.spec.identityProviders takes an array, so it is possible to specify multiple providers.

We have a few things that need to be filled out here:

name
bindDN
bindPassword
ca
url
attributes
mappingMethod

Name

This is the display name for this IdentityProvider. When logging on, if multiple IdentityProviders are configured, the user will see this name to choose from. In the case of only having a single provider, this is almost never seen.

Whatever value for name is, it should be something meaningful to the humans that interact with this OpenShift cluster.

bindDN and bindPassword

The distinguished name and password used to bind to the LDAP server. bindPassword is normally sourced from a Kubernetes Secret.

ca

The certificate chain for the LDAP server. Almost every LDAP server uses a certificate signed by a non-public certificate authority (i.e. not included in the default RHEL certificate authority bundle), therefore this is almost always required.

Since FreeIPA acts as a certificate authority, this will need to be specified in the final configuration.

This is normally sourced from a ConfigMap.

url

This is the URL of the LDAP server with a query string, as defined by RFC 4516 (which obsoletes RFC 2255). The query string is, along with the attributes, the most important part of the configuration. If it is wrong, then no one can log in, and it is significantly easier to get wrong than the other fields.

The RFCs are rather dense reading, so it’s worthwhile to break down the parts of a query string:

scheme://hostname:port/base_dn?attributes?scope?filter
                       ^                             ^
                       +-----------------------------+
                              Query String

After the scheme, hostname, and port, there are four fields:

base_dn
attributes
scope
filter

LDAP organizes its data into a tree-like structure. base_dn specifies where in the tree to start searching. This can be visualized with tools like Apache Directory Studio.

Apache Directory Studio Screenshot

attributes specify what attributes from the found entities to return from the search. An entity is a node on the tree that represents something, like a user account or a group. These, of course, can have many attributes. Some common examples are mail for e-mail address, sn for surname, givenName, loginShell, homeDirectory.

For OpenShift, this should be an attribute that is unique to each entity returned by the query, like uid, which represents a Linux username.

Again, a tool like Apache Directory Studio can be very helpful exploring the LDAP structure.

scope determines if or how the search recurses through the tree. The options are:

base
one
sub

The default is sub.

base restricts itself to ONLY the base_dn and is not really useful in this situation.

one searches only the first level below the base_dn.

sub fully recurses the tree below the base_dn.

The LDAP Wiki has a good page explaining LDAP Search Scopes.

Finally, filter is an LDAP Filter that allows refinement of the resultset even further. This is a really deep topic, but a simple example of this is (objectclass=person). This filter will only return entities that are of objectclass person.

If no filter is specified, then (objectclass=*) is the default. This filter is true for all entities.

The LDAP Wiki has a lot of good examples of LDAP Queries.

With this information, we can create a URL for IPA. Fortunately, for the default install of FreeIPA, we can have a fairly simple query string.

From the example:

ldaps://ipa.private.opequon.net/cn=users,cn=accounts,dc=private,dc=opequon,dc=net?uid

This example omits some parameters. If we were to fill in the default explicitly, it would look like this:

ldaps://ipa.private.opequon.net/cn=users,cn=accounts,dc=private,dc=opequon,dc=net?uid?sub?(objectclass=*)
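To make the breakdown concrete, here is a small sketch that splits the example URL into its parts using plain POSIX shell and fills in the defaults described above. The variable names are only for illustration, and the parsing is deliberately naive: it does not handle percent-encoding, and the host part would include the port if one were given.

```shell
url='ldaps://ipa.private.opequon.net/cn=users,cn=accounts,dc=private,dc=opequon,dc=net?uid'

scheme=${url%%://*}     # everything before ://
rest=${url#*://}        # drop the scheme
host=${rest%%/*}        # up to the first slash
query=${rest#*/}        # base_dn?attributes?scope?filter

# Split on "?"; fields missing from the URL come back empty.
IFS='?' read -r base_dn attributes scope filter <<EOF
$query
EOF

# Fill in the defaults for the optional fields.
scope=${scope:-sub}
filter=${filter:-'(objectclass=*)'}

printf '%s\n' "scheme=$scheme" "host=$host" "base_dn=$base_dn" \
              "attributes=$attributes" "scope=$scope" "filter=$filter"
```

Running a candidate URL through something like this before putting it in the OAuth object is a cheap way to confirm which base_dn, attribute, scope, and filter will actually be used.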

In FreeIPA, all users are always listed under cn=users,cn=accounts,dc=mydomain. This differs from some other LDAP systems (specifically Active Directory). This makes our search really simple, as we can just pull every entity out of this base_dn and be assured that all users will be able to log into OpenShift.

An Aside: Restricting Access

In this example, the LDAP URL is constructed so that all users can log in to OpenShift.

In many situations, there is a desire to lock this down to allow, for example, only users from a certain group to log in or other similar restriction.

While this is certainly possible with the correct LDAP filter, I would posit that this is an anti-pattern, and that access control is much better coordinated via Groups and RBAC policies.

I have several reasons for this:

It’s imperative that the LDAP query returns quickly. This query is run every time a user logs in. If the query takes a long time to run, then the user experience will suffer.
Changing the OAuth object can potentially cause loss of service, especially if updated with incorrect values.

If you configure a very complicated query (especially if using nested group search on a complicated LDAP tree), then #1 is very possible.

And, secondly, if you specify a complicated query, it increases the chances of having to change the query in the future, which increases the chances of mistakes when updating the OAuth object.

Therefore, my opinion is to let everyone log in. It’s a simple query and is unlikely to ever change unless there are massive changes to the LDAP structure.

In that situation, controlling access is done via Groups and Role-Based Access Control. This is easier to implement, less risky to change, and much more flexible. A future blog post will cover this topic in detail.

attributes

When a user logs into OpenShift using an IdentityProvider, a set of proxy Kubernetes objects is created to represent that user. These objects are used predominantly in role-based access control and group membership, as well as in creation metadata for certain other objects.

The attributes field configures the IdentityProvider to map attributes in the LDAP entity to fields in these proxy objects.

There are four attributes, let’s go over them quickly:

name
email
id
preferredUsername

name is a human-readable name of the user. In FreeIPA, this is in the cn attribute of the LDAP entity.

email is, obviously, the user’s email address. In FreeIPA, this is in the mail attribute of the LDAP entity.

id is the most important of these attributes, as it is meant to map to an LDAP field that uniquely identifies the user. As a consequence, this attribute should never change during the life of the LDAP entity. If it does change, then the link between the user entity in LDAP and the User object in OpenShift will be broken, and the user will no longer be able to log in.

In FreeIPA, the id attribute should be mapped to the dn attribute of the LDAP entity.

preferredUsername is optional, but useful. By default, the IdentityProvider will use id as the username for that user in OpenShift. Unfortunately, dn is a big, long, ugly LDAP Distinguished Name, like this: uid=cfh,cn=users,cn=accounts,dc=private,dc=opequon,dc=net.

preferredUsername, if specified, causes a different attribute to be used for the username. Generally, I want to use the same username as I use for logging into Linux systems. In FreeIPA, this is in the uid attribute of the LDAP entity.

mappingMethod

When multiple IdentityProviders are configured in a single cluster, then the mappingMethod is set to determine how username conflicts are handled.

In the case where you only have one IdentityProvider configured, then claim is the right value.

Putting it all together

Pre-requisites

Service Account

I need a read-only service account in my FreeIPA LDAP. This is not a regular user. I don’t want it to be able to log into any hosts or even the IPA console. I only want it to be able to do queries against FreeIPA’s LDAP.

To do this, create an LDAP System Account.

All these examples use the base domain of my system (dc=private,dc=opequon,dc=net). You’ll obviously need to swap this out with the specifics of your FreeIPA installation.

On the FreeIPA server:

[root@ipa ~]# ldapmodify -x -D 'cn=Directory Manager' -W
Enter LDAP Password: 
# Paste in the below
dn: uid=openshift,cn=sysaccounts,cn=etc,dc=private,dc=opequon,dc=net  <- CHANGE OPENSHIFT
changetype: add                                                          TO DESIRED NAME OF
objectclass: account                                                     SERVICE ACCOUNT
objectclass: simplesecurityobject
uid: openshift          <-  THIS SHOULD MATCH THE DN ABOVE
userPassword: changeMe  <-  THIS IS YOUR SERVICE ACCOUNT PASSWORD
passwordExpirationTime: 20380119031407Z <- 2038 EFFECTIVELY THE END OF TIME
nsIdleTimeout: 0

^D  <- ACTUALLY TYPE Control-D

If successful, you should see the following output:

adding new entry "uid=openshift,cn=sysaccounts,cn=etc,dc=private,dc=opequon,dc=net"

You can then test this service account using a command similar to the following:

ldapsearch  -x -D 'uid=openshift,cn=sysaccounts,cn=etc,dc=private,dc=opequon,dc=net' -W

This will spit out every object in the LDAP database. Of course you can filter, see ldapsearch(1).

For both the ldapmodify and ldapsearch commands, the arguments mean the following:

-x - use simple authentication (instead of a Kerberos ticket)
-D - the distinguished name of the user
-W - prompt for password

Certificate Authority

This file is normally found at /etc/ipa/ca.crt on any machine connected to FreeIPA.

Grab this file so that it can be put into a ConfigMap later, or alternatively run oc commands from a Linux machine that is connected to FreeIPA.

OpenShift Configuration

Creating Secrets and ConfigMaps

The OAuth object we are creating requires a Secret for the LDAP service account password.

This secret must be in the openshift-config Project, but can be named anything so long as that name is used in the later OAuth object.

To do so via the command line:

# oc project openshift-config 
# oc create secret generic ldap-secret --from-literal=bindPassword=changeMe

This will create a Secret like the below. Alternatively, this manifest can be applied directly using oc create, just be sure to substitute the value of data.bindPassword with the Base64 encoded string of your actual password.

apiVersion: v1
kind: Secret
metadata:
  name: ldap-secret
  namespace: openshift-config
type: Opaque
data:
  bindPassword: Y2hhbmdlTWU=
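The Base64 value in the manifest above can be produced on the command line. This is a small sketch using the placeholder password changeMe from this post; the variable names are only for illustration.

```shell
# printf avoids the trailing newline that echo would add to the encoded value.
encoded=$(printf '%s' 'changeMe' | base64)
echo "$encoded"

# Decoding round-trips back to the original password.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

If a plain echo is used instead of printf, the newline becomes part of the encoded password and the LDAP bind will fail.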

In addition to the Secret, the FreeIPA certificate authority chain must be in a ConfigMap object in the openshift-config namespace. This file is normally found at /etc/ipa/ca.crt on any machine connected to FreeIPA.

To create this ConfigMap via the command line:

oc project openshift-config
oc create cm custom-ca-oauth --from-file=ca.crt=/etc/ipa/ca.crt -n openshift-config

Alternatively, this manifest can be applied directly using oc create.

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-ca-oauth
  namespace: openshift-config
data:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    REPLACE ME
    -----END CERTIFICATE-----

Creating OAuth Object

With the LDAP service account created and the Secrets and ConfigMaps in place, it’s time to update the OAuth object.

In OpenShift 4, an OAuth object will exist by default regardless of whether there are any IdentityProviders configured. We could use oc replace to overwrite this object or just edit the object using either oc edit or the web console.

Regardless of method chosen, the resulting object definition should look like this:

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - ldap:
      attributes:
        email:
        - mail
        id:
        - dn
        name:
        - cn
        preferredUsername:
        - uid
      bindDN: uid=openshift,cn=sysaccounts,cn=etc,dc=private,dc=opequon,dc=net
      bindPassword:
        name: ldap-secret       # MUST MATCH OUR SECRET FOR LDAP
      ca:
        name: custom-ca-oauth   # MUST MATCH OUR CA CONFIGMAP
      insecure: false
      url: ldaps://ipa.private.opequon.net/cn=users,cn=accounts,dc=private,dc=opequon,dc=net?uid
    mappingMethod: claim
    name: ldapidp
    type: LDAP

bindDN and url must, of course, be updated to match your environment. The names of the Secret and Certificate Authority ConfigMap must match the names of those created in the previous section.

Once this is applied, the authentication ClusterOperator will apply the configuration. You can check the status by looking at the ClusterOperator objects (the short name for these is co):

# oc get co authentication
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.10.12   True        False         False      24h     

While being configured, the PROGRESSING column will flip to True. This configuration should only take a few minutes to apply.

If the DEGRADED column flips to True for a long period (or if AVAILABLE becomes False), then use oc describe co authentication to see events related to the problem.

Testing Logging in

After the authentication ClusterOperator applies the configuration, on the next login attempt the user will be presented with an option to log in using FreeIPA as the provider. The name of this provider is specified in the OAuth object’s spec.identityProviders[].name field. In our example, this is ldapidp.

OpenShift Login Screen

Once logged in, we should be able to see our User object.

# oc get users
NAME   UID                                    FULL NAME    IDENTITIES
cfh    826d995a-2045-42dd-a4fa-81d338ebbefe   Clark Hale   ldapidp:dWlkPWNmaCxjbj11c2Vycyxjbj1hY2NvdW50cyxkYz1wcml2YXRlLGRjPW9wZXF1b24sZGM9bmV0

Additionally, we can see the Identity object:

[root@meriadoc opequon_labs_ocp_playbooks]# oc get identity
NAME                                                                                   IDP NAME   IDP USER NAME                                                                  USER NAME   USER UID
ldapidp:dWlkPWNmaCxjbj11c2Vycyxjbj1hY2NvdW50cyxkYz1wcml2YXRlLGRjPW9wZXF1b24sZGM9bmV0   ldapidp    dWlkPWNmaCxjbj11c2Vycyxjbj1hY2NvdW50cyxkYz1wcml2YXRlLGRjPW9wZXF1b24sZGM9bmV0   cfh         826d995a-2045-42dd-a4fa-81d338ebbefe

We can see that these long strings are actually the Base64-encoded value of the id attribute, which we set to dn:

[root@meriadoc opequon_labs_ocp_playbooks]# echo "dWlkPWNmaCxjbj11c2Vycyxjbj1hY2NvdW50cyxkYz1wcml2YXRlLGRjPW9wZXF1b24sZGM9bmV0" | base64 -d
uid=cfh,cn=users,cn=accounts,dc=private,dc=opequon,dc=net

Next Steps

Now that users can log in, the next step is to synchronize groups from FreeIPA and then assign Roles to those users.

Removing labels or annotations from K8s objects using Ansible

2022-03-25T02:42:00-04:00

Just a small little trick worth recording. To remove a label, annotation, or likely any other key from a Kubernetes object using Ansible’s k8s module, you must set that key to NULL.

    - name: Remove Annotation Blah
      k8s:
       api_key: ""
       state: present
       definition:
         apiVersion: v1
         kind: Node
         metadata:
           name: varda.private.opequon.net
           annotations:
             blah: NULL

The above snippet removes the blah annotation from this node object.

Certainly not intuitive…took me a good day or two of searching when I needed this last year.

Relative Paths in AmigaDOS Commands

2022-03-25T01:00:00-04:00

I’ve revived my old Amiga computers and have been playing around with them.

AmigaDOS is an interesting piece of history. It was based upon TRIPOS, which was initially released in 1978. UNIX was released initially in 1973ish, so it’s unclear to me how much influence UNIX had on TRIPOS and subsequently AmigaDOS.

I have a suspicion TRIPOS was relatively free of UNIX influence, based on the absence of . and .. for relative file access.

For the uninitiated, . and .. are special files that occur in every UNIX directory. . refers to the current directory and .. refers to the parent directory. The root directory, i.e. /, is the only special case. Since it has no parent, root’s .. refers to itself.

What this means is, if I have a tree like this:

$ tree
test
├── a
│   └── file1
├── b
│   └── sub
│       ├── file2
│       └── subsub
└── c
    └── sea_file

I can easily refer to files in OTHER directories than my current working directory.

### My current working directory is test.
$ pwd
/test

### To refer to a file under test I can use '.' to refer to '/path/to/test'
$ cat ./a/file1
      ^
      +------------- The full path for this file is /test/.
                     but it refers to /test

### If I change my working directory to b/sub
$ cd b/sub

### I can still easily refer to files in other directories:

$ cat ../../a/file1
      ^  ^
      |  +----- .. of /test/b/, which refers to /test/
      +-------- .. of /test/b/sub which refers to /test/b/

This concept is so essential in UNIX/Linux that I was at a bit of a loss when I started using the AmigaDOS shell again.

In UNIX, I’m very used to doing this:

cp -r /path/to/some/dir/* .

To copy everything in /path/to/some/dir/ to the current directory.

In AmigaDOS, I reflexively ran a similar command (note: #? is equivalent to *):

5.Ram Disk:> copy work:test/#? .
   .   [created]
   a..copied.
   b..copied.

This created a directory called . and copied my files into it! NOT WHAT I WANTED!

5.Ram Disk:> dir
     . (dir)

This leads me to wonder….how does one do relative pathing in the Amiga Shell? If I used to know years ago, I’ve forgotten.

There are basically three characters that end up providing similar functionality to . and .. in UNIX:

:
/
""

Rather than having a unified file tree, AmigaOS has volumes each with their own name (similar to CP/M or VMS and derivative operating systems).

Volume names come from three sources:

File system labels, e.g. Workbench:, the normal volume name of the bootable AmigaOS partition.
Device names, e.g. DF0: for the first floppy disk.
Assigns, which map directories in volumes to volume names, e.g. C: which normally goes to SYS:C and SYS: which maps to the boot volume.

There’s only one name space for volumes, so multiple names could map to the exact same storage location. However, generally, the AmigaDOS Shell will show the file system label.

For example, the first floppy disk has a device name of DF0:, which never changes no matter which disk is inserted. A disk will have a file system label like Workbench: and if that disk is booted from, SYS: is also assigned to the disk. All three of these volume names point to the same storage.

Additionally, Assigns do not work the same when they point to sub directories of other volumes. If I do this:

5.Ram Disk:> cd c:
5.Workbench:C>

Notice my current working directory changes to what C: references, rather than have it appear as if it were an actual volume.

With that in mind, let’s return to my example and assume I have a volume name test: with the following tree:

test:
├── a
│   └── file1
├── b
│   └── sub
│       ├── file2
│       └── subsub
└── c
    └── sea_file

So long as my current working directory is somewhere under test:, I can list the files in the root of test: like this:

5.TEST:c> dir :
     a (dir)
     b (dir)
     c (dir)

If my current working directory is test:b/sub, I can copy test:c/sea_file to there like so:

5.TEST:b/sub> dir
  file2   subsub
5.TEST:b/sub> copy //c/sea_file ""
5.TEST:b/sub> dir
  file2     sea_file  subsub

In this case, "" acts just like . in UNIX, and // acts just like ../../.

This doesn’t seem as elegant as UNIX, but it is functional. However, I’m sometimes unable to tell if I actually find UNIX design to be good or if I’m just used to it.

Originally, I was going to write that I couldn’t find this information in the AmigaOS documentation, but as I was finishing up this blog post, I found basically my exact description under the Command Line Characters section (sorry it’s not a direct link). No mention of “relative path” references, though…so it is a bit hard to find.

Azure Site-to-Site VPN and home network integration

2022-03-07T13:30:00-05:00

Introduction

Recently, I’ve had to work pretty extensively with various cloud providers.

Many companies I’ve worked with fully integrate their cloud provider accounts into their existing on-premise network. For this blog, I’m going to call this a cloud “enclave”.

The hosts and services in the cloud enclave are, for all intents and purposes, equal to resources in the “on-premise” data center. VMs in the cloud get IP addresses that are routable on the existing, “on-premise” networks and DNS is configured such that cloud and “on-premise” names roll up under the same domain and are universally resolvable. Most cloud workloads are only accessible via the private network.

There are significant advantages to this “enclave” concept.

Since services in the cloud are not publicly accessible, the security “threat-level” is similar to private, on-premise services.
There are not two classes of network/DNS services. Everything is basically on one big happy network.
(Arguably) things are more portable, since there’s no reliance on a cloud provider specific way to expose services.

Creating an “enclave” and bridging it to one’s “on-premise” network is a significantly more complicated activity than most tutorials describe. For a home lab, this may seem like overkill, but I consider this a useful exercise as it more closely resembles the setup I’ve seen in many companies.

This blog post describes configuring a cloud enclave with Azure. I had four major goals with this:

My cloud enclave must

be routable from my existing home network
integrate with my existing DNS solution
be automated so it can be created/destroyed on demand
be minimally disruptive to my existing setup

Existing Home Lab

First, let’s consider my “On-Premise” home network. I run many services that would be considered essential in a typical corporate network:

DNS
LDAP
Kerberos
Certificate Management

These services are provided by Red Hat Identity Management (IdM), which is a version of FreeIPA bundled with RHEL.

Additionally, I also have DHCP managed by Red Hat Satellite.

My IP address topology is relatively simple. I have a flat /24 as my main network, although I’ve dabbled with splitting out separate subnets and VLANs for things like baseboard management controllers and storage.

For my enclave, I only need to extend DNS and the routable network into the cloud. DHCP will be handled separately by the cloud, and the other services, e.g. LDAP and Kerberos, will become accessible by virtue of extending DNS and the routable network.

Designing my IP Space

I have used 172.31.0.0/24 as my home network IP range for many years.

When provisioning private cloud networks, I could choose any IP range in the RFC1918 space, so long as it doesn’t overlap with my “on-premise” network.

I decided to extend my routable network to 172.31.0.0/16, which can be divided into 8 /19 networks, each having 8192 IP addresses.

Subnet	Assignment
172.31.0.0/19	On-premise Networks
172.31.32.0/19	Unassigned
172.31.64.0/19	Unassigned
172.31.96.0/19	Unassigned
172.31.128.0/19	Unassigned
172.31.160.0/19	Unassigned
172.31.192.0/19	AWS
172.31.224.0/19	Azure

With this layout, my “on-premise” network can remain the same as it’s always been, since 172.31.0.0/24 is a subnet of 172.31.0.0/19. Additionally, I’ll have the potential to use other subnets of 172.31.0.0/19 in case I ever choose to deviate from a flat network.
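As a quick check of the arithmetic, the split can be reproduced with Python's standard ipaddress module. This is only a sketch of the calculation; the variable names are illustrative and it is not part of the automation described later.

```python
import ipaddress

routable = ipaddress.ip_network("172.31.0.0/16")
home = ipaddress.ip_network("172.31.0.0/24")

# Split the /16 into /19 blocks, in address order.
blocks = list(routable.subnets(new_prefix=19))

for block in blocks:
    print(block, block.num_addresses)

# The existing flat home network sits inside the first /19.
print(home.subnet_of(blocks[0]))
```

The loop prints eight blocks of 8192 addresses each, starting at 172.31.0.0/19 and ending at 172.31.224.0/19, matching the table above.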

With a /19, it’s highly unlikely that I’ll use all of the IP addresses in a given allocation. In my situation, this is ideal, since I will never have to worry about running out of IP addresses. In an “Enterprise” situation, I would be more granular in specifying things (IPv6 can’t come soon enough!).

Designing DNS

Until now, I’ve had one flat DNS zone for all hostnames in my network: private.opequon.net.

I could continue with one flat zone, but through my experimentation, this seems to be more trouble than it is worth as every cloud host will have to register its hostname with my FreeIPA/IdM server.

Instead, it seems worthwhile to leverage each cloud provider’s internal DNS and delegate a zone to each cloud provider. This way a VM started in Azure can auto register its hostname, without any ongoing effort on my part.

I can then bridge together the zones of my existing “on-premise” network and each cloud zone using DNS forwarders. My existing “on-premise” zone can remain untouched.

What I’ve come up with for my network is:

Zone	Purpose
private.opequon.net	“On premise” Hosts
aws.private.opequon.net	AWS Hosts
azure.private.opequon.net	Azure Hosts
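With nested zones like these, the most specific (longest) matching zone is the one responsible for a name. The following Python sketch illustrates that selection rule. The function zone_for and the example hostnames are hypothetical, and real resolvers implement this internally rather than with code like this.

```python
from typing import Optional

ZONES = {
    "private.opequon.net": "on-premise",
    "aws.private.opequon.net": "AWS",
    "azure.private.opequon.net": "Azure",
}


def zone_for(hostname: str) -> Optional[str]:
    """Return the longest zone that the hostname falls under, if any."""
    name = hostname.rstrip(".").lower()
    matches = [z for z in ZONES if name == z or name.endswith("." + z)]
    return max(matches, key=len, default=None)


# "vm1" is a made-up Azure VM name used purely for illustration.
for host in ("vm1.azure.private.opequon.net", "ipa.private.opequon.net"):
    zone = zone_for(host)
    print(host, "->", zone, f"({ZONES[zone]})")
```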

For Azure, there is a problem with this set up and reverse DNS, but I’ll address that in the “Problems and Improvements” section.

Critical Components

With my DNS and IP ranges set, we can move on to the critical components of this “enclave”. While there are many different “bits” that need to be configured, the two critical components that make this work are

Site-to-Site VPN
DNS Forwarding

Site-to-Site VPN w/ Libreswan

Azure offers a Site-to-Site VPN object that uses IPSec tunnels to create a VPN.

For the “on-premise” side of the tunnel, I use the Libreswan implementation that’s bundled with RHEL.

Visually, my configuration looks like this:

DNS Overview.

Azure does not directly support Libreswan, but does offer an Openswan option. Libreswan is a fork of Openswan and their configuration format has diverged somewhat, but it’s a good starting point.

For reference on configuring my RHEL gateway host, Using LibreSwan with Azure VPN Gateway was the only useful blog post I’ve been able to find.

DNS Forwarding

At time of writing, Azure does not have a first-class DNS forwarder service. Therefore, I have to build this component. Fortunately, it is straightforward to build a DNS forwarder using a tiny VM and DNSMasq.

The solution looks like this:

High Level Networking Overview.

In Azure, I create a private zone (azure.private.opequon.net) and link it to my VNET. With that, hosts inside the VNET can query the Azure DNS servers and resolve hostnames in my private zone.

However, hosts outside of the VNET, like my on-premise network, cannot access Azure DNS!

Having the DNS Forwarder VM allows me to, essentially, proxy DNS queries between my on-premise FreeIPA and Azure DNS.

So that my Azure VMs can also resolve on-premise names, the DNS Forwarder VM also sends DNS queries back to FreeIPA for certain zones. To ensure all Azure VMs have this by default, the VNET’s primary DNS server is changed from the default Azure DNS to my DNS Forwarder.
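The per-zone forwarding described here maps naturally onto dnsmasq's server=/zone/address option. As a rough illustration, the Python sketch below renders such lines from a zone-to-server mapping, using the addresses that appear in the sample terraform.tfvars later in this post. The helper name is hypothetical, and the configuration actually deployed by the Ansible playbook may look different.

```python
from typing import Dict, List


def dnsmasq_forward_rules(zone_servers: Dict[str, str]) -> List[str]:
    """Render one dnsmasq 'server=/zone/address' line per zone."""
    return [f"server=/{zone}/{server}" for zone, server in zone_servers.items()]


rules = dnsmasq_forward_rules({
    # On-premise zones go back to FreeIPA.
    "private.opequon.net": "172.31.0.101",
    "0.31.172.in-addr.arpa": "172.31.0.101",
    # The Azure private zone goes to the Azure-provided resolver.
    "azure.private.opequon.net": "168.63.129.16",
})
print("\n".join(rules))
```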

Let’s set everything up!

Manual Steps

Most things are covered by the automation, but a few items I had to manually configure due to lack of available automation modules and/or APIs.

Both of these items are one-time setup, and neither incurs a cloud cost.

Router Setup

My Libreswan endpoint is within my network and thus is behind my router. Therefore, I need to configure my router to forward certain ports to my Libreswan host.

Port	Protocol	Destination
4500	UDP	Libreswan Host
4500	TCP	Libreswan Host
500	UDP	Libreswan Host

If your Libreswan endpoint is at the edge of your network (i.e. it holds a public IP address), then this step is not necessary.

FreeIPA Forward Zones

At time of writing, there are no Ansible modules that allow configuration of Forward Zones in FreeIPA/IdM. I may just have missed one, not sure.

For these examples to make sense, here is my relevant configuration:

DNS Forwarder VM IP address: 172.31.225.4
FreeIPA/IdM Server Hostname: ipa.private.opequon.net
Azure subdomain: azure.private.opequon.net

First, I create a DNS Forward Zone in FreeIPA with a forward policy of ‘only’. This ensures that queries for azure.private.opequon.net are forwarded to the DNS Forwarder VM.

[root@ipa ~]# ipa dnsforwardzone-add --forward-policy=only --forwarder=172.31.225.4 azure.private.opequon.net.
Server will check DNS forwarder(s).
This may take some time, please wait ...
ipa: WARNING: DNS server 172.31.225.4: query 'azure.private.opequon.net. SOA': The DNS operation timed out after 10.0006103515625 seconds.
  Zone name: azure.private.opequon.net.
  Active zone: TRUE
  Zone forwarders: 172.31.225.4
  Forward policy: only

The warning about DNS operation timed out will occur if you’re performing this step BEFORE provisioning the entire enclave with the rest of the automation. Ignore it.

Next, to make the forward zone effective, I need a “glue record” to include it in my primary zone. This is just a simple NS record that points back to my IPA server.

[root@ipa ~]# ipa dnsrecord-add private.opequon.net azure --ns-rec=ipa.private.opequon.net.
  Record name: azure
  NS record: ipa.private.opequon.net.

Once the rest of the enclave is configured, this IdM configuration should immediately (or pretty close to immediately) work.

Automation Code

All the code described here is on GitHub

The automation I have for this is written in a combination of Terraform and Ansible.

Two main scripts call the pieces in unison:

build.sh - Creates all cloud infrastructure in Azure and sets up Libreswan and the DNS Forwarder
destroy.sh - Tears everything down.

The build.sh and destroy.sh scripts assume that the Site-to-Site VPN tunnel is being configured on the same host that is running the script. Some modification of the Ansible will be required if you want to run it from a different host.

Sometimes the Azure APIs are very slow and cause the Terraform code to time out! Luckily, all this code is idempotent, so if that happens, it can just be run again.
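Because re-running is safe, the retry could also be automated. The following is a minimal Python sketch of that idea, not something included in the repository. The function name, the attempt count and the delay are all assumptions, and the runner parameter exists only so the logic can be exercised without touching Azure.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=30, runner=subprocess.run):
    """Run an idempotent command until it exits 0; return the attempt number."""
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)  # give the slow API some time to settle
    raise RuntimeError(f"{command!r} failed after {attempts} attempts")


# Hypothetical usage, run from the directory containing the scripts:
# run_with_retries(["./build.sh"])
```

This is only reasonable because the build is idempotent; retrying a non-idempotent script this way could leave duplicate or half-created resources behind.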

Terraform Files

I’ve attempted to format and organize the Terraform files in a way that is easy to understand, rather than strictly following the normal conventions.

The main objects are in the following files:

main.tf - Objects for the site-to-site VPN
subnets.tf - Subnets and Network Security Groups
dns.tf - DNS Private Zone
dns_forwarder.tf - DNS Forwarder VM

Then there are the more conventional Terraform files, which contain what you’d expect.

outputs.tf
providers.tf
variables.tf
versions.tf

main.tf

main.tf contains the Site-to-Site VPN resources, and the bare minimum pre-requisites.

azurerm_resource_group defines the resource group. Every Azure resource created by this automation will live in this resource group.

azurerm_virtual_network defines our “Azure Virtual Network”, aka VNet, to which we will add subnets. The VNet has variables for the overarching address space as well as DNS servers for everything. In this automation, I’m overriding the default Azure DNS with the statically assigned IP address of our DNS Forwarder.

azurerm_subnet creates the subnet required for the Site-to-Site VPN. This is here, instead of in subnets.tf, because it’s required for the VPN connection. Also, it must be named GatewaySubnet, otherwise dependent resources throw an error.

azurerm_local_network_gateway provides details for our “Local” network, meaning the “On-Premise” side of the Site-to-Site VPN tunnel.

azurerm_public_ip provides a public IP address for the Azure side of the Site-to-Site VPN tunnel.

azurerm_virtual_network_gateway provides the Azure-side of the Site-to-Site VPN tunnel. It links together the Public IP address and the Azure GatewaySubnet.

azurerm_virtual_network_gateway_connection creates the VPN. It links together the azurerm_virtual_network_gateway, azurerm_local_network_gateway, and the VPN Shared Key and creates the VPN configuration on the Azure side. Once this resource has finished provisioning, it is possible to start LibreSwan on the on-premise side and activate the connection.

subnets.tf

subnets.tf contains all azurerm_subnet definitions, except for the GatewaySubnet.

In the public repo, I have just one subnet for my “main” network. In my personal lab, I add separate subnets for OpenShift and other things I run in the cloud.

dns.tf

dns.tf does some minimal DNS work.

azurerm_private_dns_zone actually creates our private DNS zone in Azure.

azurerm_private_dns_zone_virtual_network_link then links that DNS zone to our VNet. I have the registration_enabled flag set to true, so all Virtual Machines connected to the VNet will automatically have their hostnames registered in our private zone.

dns_forwarder.tf

dns_forwarder.tf defines the Virtual Machine that will be our DNS forwarder.

azurerm_network_interface defines its network interface and, importantly, its static IP within the VNet.

azurerm_linux_virtual_machine is the virtual machine definition. It currently uses a RHEL 8 VM, but this could be changed to almost any RHEL-like VM and still work.

azurerm_network_security_group is a simple network security group for the instance.

Ansible Playbooks

There are three Ansible playbooks. The build.sh and destroy.sh scripts pass the outputs of the Terraform build into these, so they have no separate inventory or variable files.

azure_dns_forwarder.yaml - Configures the DNS forwarder
azure_ipsec.yaml - Configures the on-premise side of the VPN tunnel on localhost
azure_ipsec_remove.yaml - Destroys the on-premise side of the VPN tunnel
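One way to hand Terraform outputs to a playbook is to flatten the JSON printed by terraform output -json and pass it to ansible-playbook via --extra-vars, which accepts a JSON string. The Python sketch below shows only the flattening step. The output name dns_forwarder_ip is made up for illustration, and the real build.sh may pass its values differently.

```python
import json


def extra_vars_from_outputs(terraform_output_json: str) -> str:
    """Turn 'terraform output -json' text into a flat JSON object of values."""
    outputs = json.loads(terraform_output_json)
    flat = {name: data["value"] for name, data in outputs.items()}
    return json.dumps(flat, sort_keys=True)


# Hypothetical output; Terraform wraps each value with type and sensitivity info.
sample = (
    '{"dns_forwarder_ip": '
    '{"sensitive": false, "type": "string", "value": "172.31.225.4"}}'
)
print(extra_vars_from_outputs(sample))
```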

Sample terraform.tfvars

A sample terraform.tfvars is included.

# What Azure Region to use for all resources
azure_region_name="eastus"

# Resource group for all resources
resource_group_name="ENCLAVE-EAST"

# Default name for most resources
default_resource_name="ENCLAVE-EAST"

# Address Space and DNS servers for VNET
vnet_address_space=["172.31.224.0/19"]
vnet_dns_servers=["172.31.225.4"]

# Subnet for the Azure side of the Site-to-Site VPN
gateway_subnet=["172.31.224.0/24"]

# Names for Local Network Gateway and resources related to
# the on-premise side of the Site-to-Site VPN
local_network_gateway_name="ONPREM"
on_premise_name="ONPREM"

# On-Premise network range and public IP.
on_premise_private_network_ranges=["172.31.0.0/19"]
on_premise_public_ip_address="108.56.139.185"

# DNS Zone for Azure Private DNS
private_dns_zone_name="azure.private.opequon.net"

# Static Private IP address of DNS Forwarder.
dns_forwarder_ip_address="172.31.225.4"

# Name for the Public IP address for the Azure side of the VPN
vpn_azure_public_ip_name="VPN-EAST"

# Name and range of main subnet in Azure
subnet_main_name = "EAST-MAIN"
subnet_main_cidr_ranges = ["172.31.225.0/24"]

# DNS information for DNS Forwarder VM
on_premise_dns_zones = ["private.opequon.net", "0.31.172.in-addr.arpa"]
on_premise_dns_server = "172.31.0.101"
# The Azure Private DNS server is a static IP address
# See:  https://docs.microsoft.com/en-us/azure/virtual-network/virtual-networks-name-resolution-for-vms-and-role-instances#considerations
azure_private_dns_server = "168.63.129.16"
# The reverse DNS doesn't work, currently, for on-prem resolving Azure PTR records.
vnet_dns_reverse_zones = ["225.31.172.in-addr.arpa"]

# Shared Key for VPN
# DO NOT CHECK THIS INTO GIT, it's only here to show what the variable name is!
vpn_shared_key="mysecretkey"
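Since several of these values must be consistent with each other, it can help to sanity-check the address plan before running Terraform. The following Python sketch does that with the standard ipaddress module, using the values from the sample above. The check_plan function is hypothetical, it is not part of the repository, and the rules it checks are my reading of the layout described in this post.

```python
import ipaddress

vnet = ipaddress.ip_network("172.31.224.0/19")
gateway_subnet = ipaddress.ip_network("172.31.224.0/24")
main_subnet = ipaddress.ip_network("172.31.225.0/24")
on_premise = ipaddress.ip_network("172.31.0.0/19")
dns_forwarder = ipaddress.ip_address("172.31.225.4")


def check_plan():
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for name, subnet in (("gateway", gateway_subnet), ("main", main_subnet)):
        if not subnet.subnet_of(vnet):
            problems.append(f"{name} subnet {subnet} is outside VNet {vnet}")
    if gateway_subnet.overlaps(main_subnet):
        problems.append("gateway and main subnets overlap")
    if vnet.overlaps(on_premise):
        problems.append("VNet overlaps the on-premise range")
    if dns_forwarder not in main_subnet:
        problems.append("DNS forwarder IP is not in the main subnet")
    return problems


print(check_plan() or "address plan looks consistent")
```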

Problems and Improvements

There are still problems with this. Maybe I will solve them and update this post when I have the time.

Lack of security groups

This configuration is pretty wide open internally. I’m doing very little of the Azure subnet segregation that would be common in an Enterprise organization.

I still feel pretty safe with this configuration, because it’s only really accessible from my home network, but if using this as a pattern for something more serious, it’s worth considering additional network segmentation.

Reverse DNS

Try as I might, I cannot get Reverse DNS forwarding to Azure private DNS working with FreeIPA/IdM.

When I set up a forward zone in FreeIPA/IdM, it expects the destination DNS server to have an SOA record equivalent to the forward zone. So, if my forward zone is 225.31.172.in-addr.arpa, then the DNS server I’m forwarding to is expected to have an SOA record for 225.31.172.in-addr.arpa.
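For reference, the reverse names involved can be derived with Python's ipaddress module. This sketch only shows how the PTR name for an address and the zone name for a /24 relate to each other. The helper names are hypothetical and it says nothing about how Azure or FreeIPA store these records.

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Name of the PTR record for a single IP address."""
    return ipaddress.ip_address(address).reverse_pointer


def reverse_zone_for_slash24(network: str) -> str:
    """Reverse zone name covering an IPv4 /24 network."""
    net = ipaddress.ip_network(network)
    if net.version != 4 or net.prefixlen != 24:
        raise ValueError("this sketch only handles IPv4 /24 networks")
    # Drop the label for the host octet from the network address's PTR name.
    return net.network_address.reverse_pointer.split(".", 1)[1]


print(ptr_name("172.31.225.4"))                    # 4.225.31.172.in-addr.arpa
print(reverse_zone_for_slash24("172.31.225.0/24"))  # 225.31.172.in-addr.arpa
```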

Azure DNS appears to dump all PTR records into a big in-addr.arpa zone. So, there are no appropriate SOA records and FreeIPA/IdM refuses to set up the forward zone.

Other DNS implementations, like dnsmasq, can cope with this. I suspect this is because they textually parse the DNS query at the server before sending it on, but I’m not entirely sure. A bit more research is required.

At any rate, Reverse DNS does not automatically work with this solution, which is a problem.

Plain text VPN key

This example just has the shared VPN key as plain text. For a home lab, this is mostly acceptable, although still not ideal. For an enterprise solution, this VPN key should be vaulted and highly protected.

Routing weirdness

In this example, the IPSec tunnel is not at the edge of the network, but is instead inside my private network. My consumer router forwards certain ports to the IPSec host in order to make the connection work.

When I’m on the road (a rarity these days), I use the VPN client/server provided by my consumer router, and there are routing difficulties when trying to use resources in the enclave when I’m connected to my router’s VPN.

Fedora Copr and Custom Gem RPMs for Fedora

2022-01-11T17:42:00-05:00

This blog is formatted using Jekyll and I use the jekyll-redirect-from plugin to do URL redirection, so that old blog posts migrated from my former blog platform can still be accessed via the same URL.

Fedora is my regular desktop operating system, and Jekyll has been packaged by Fedora for some years. However, most plugins, including jekyll-redirect-from, are NOT packaged by Fedora.

In the past, this has led me to have to muck about with other package tools, like gem or pip. I don’t like it. They put stuff in weird places, duplicate system provided dependencies and just cause a mess.

No…call me stubborn, but I’d just like nice, simple RPM packages. Thank you very much.

Enter Fedora Copr. Copr is a front-end to Fedora’s package build system that allows one to build custom RPM packages and host them in personal, publicly accessible yum repositories.

Copr has been around for a while, and I played around with building some Emacs packages about 4 years ago, but it’s become substantially easier since then.

In just a few commands, I can have the Jekyll plugin I want, or really most Gems, converted to an RPM and available for me to install.

Setup

First, in order to use Copr, a Fedora Account is required. Signing up for one is free and relatively easy.

Next, I’d highly recommend getting the copr-cli package installed on your Fedora machine. It’s part of the default Fedora repositories, so it should be as simple as dnf install copr-cli.

Almost all of the tasks could be done through the Copr Web UI, but in this post, I’m using copr-cli.

Next, populate the Copr token by visiting https://copr.fedorainfracloud.org/api/

Put the output there into ~/.config/copr. It should look something like this:

[copr-cli]
login = gibberish
username = your_user_name
token = gibberish
copr_url = https://copr.fedorainfracloud.org
# expiration date: 2022-07-09
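If copr-cli complains about the configuration, a quick structural check can help spot a bad paste. The Python sketch below parses text in this format with the standard configparser module and reports which of the keys shown above are missing. The helper name is hypothetical, and the list of expected keys is taken from this example rather than from any copr-cli specification.

```python
import configparser

EXPECTED_KEYS = ("login", "username", "token", "copr_url")


def missing_copr_keys(config_text: str) -> list:
    """Return the expected [copr-cli] keys that are absent from the text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("copr-cli"):
        return list(EXPECTED_KEYS)
    return [k for k in EXPECTED_KEYS if not parser.has_option("copr-cli", k)]


sample = """\
[copr-cli]
login = gibberish
username = your_user_name
token = gibberish
copr_url = https://copr.fedorainfracloud.org
# expiration date: 2022-07-09
"""
print(missing_copr_keys(sample))  # []
```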

copr whoami should now return your Fedora Account name:

[chale@work-wired ~]$ copr whoami
xlark

Creating a project

The first step is to create a project. This acts as your yum repository. It’s possible to create multiple projects, if desired.

For example, I have an emacs project for Emacs related packages, and I created a jekyll-gems project for all the Jekyll Gems I intend to use:

To create a project named jekyll-gems, run the following command:

[chale@work-wired ~]$ copr-cli create --chroot fedora-35-x86_64 jekyll-gems
New project was successfully created.

The --chroot fedora-35-x86_64 specifies the package build environment. There are many different options. Fedora 35 seemed reasonable to me, since I’m currently using it, although rawhide may be a more future proof selection.

Building the Package

Now that we have a project, building a package from a Gem file is just one command:

[chale@work-wired ~]$ copr-cli buildgem --gem jekyll-redirect-from jekyll-gems
Build was added to jekyll-gems:
  https://copr.fedorainfracloud.org/coprs/build/3144944
Created builds: 3144944
Watching build(s): (this may be safely interrupted)
  17:17:28 Build 3144944: pending
  17:17:58 Build 3144944: running
  17:20:28 Build 3144944: succeeded

This builds an RPM from the jekyll-redirect-from Gem into the jekyll-gems project.

Since jekyll-redirect-from is hosted on RubyGems, Copr pulls the Gem file directly from there without any additional information required.

After the build has succeeded, the package is ready to install!

Installing the Package

First enable your Copr repository. The name should always be in the form of “Fedora Account/Project Name”:

[chale@work-wired ~]$ sudo dnf copr enable xlark/jekyll-gems 
Enabling a Copr repository. Please note that this repository is not part
of the main distribution, and quality may vary.

The Fedora Project does not exercise any power over the contents of
this repository beyond the rules outlined in the Copr FAQ at
,
and packages are not held to any quality or security level.

Please do not file bug reports about these packages in Fedora
Bugzilla. In case of problems, contact the owner of this repository.

Do you really want to enable copr.fedorainfracloud.org/xlark/jekyll-gems? [y/N]: y
Repository successfully enabled.

After that, we can just install the package like normal! By Fedora standards, all RPMs for Gems are always prefixed by “rubygem-”.

[chale@work-wired ~]$ sudo dnf install rubygem-jekyll-redirect-from
Copr repo for jekyll-gems owned by xlark                                                              15 kB/s | 2.6 kB     00:00    
Dependencies resolved.
=====================================================================================================================================
 Package                            Architecture Version                Repository                                              Size
=====================================================================================================================================
Installing:
 rubygem-jekyll-redirect-from       noarch       0.16.0-1.fc35          copr:copr.fedorainfracloud.org:xlark:jekyll-gems        14 k

Transaction Summary
=====================================================================================================================================
Install  1 Package

Total download size: 14 k
Installed size: 9.8 k
Is this ok [y/N]: 
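The naming conventions seen in the transcripts above are regular enough to capture in a few lines. This Python sketch simply reproduces the three names that appear in the output (the package name, the owner/project form used with dnf copr enable, and the repository ID shown by dnf). The function names are hypothetical, and the repository ID format is inferred from the output above rather than from Copr documentation.

```python
def rpm_name_for_gem(gem_name: str) -> str:
    """Fedora-style RPM name for a Ruby Gem."""
    return f"rubygem-{gem_name}"


def copr_repo_spec(owner: str, project: str) -> str:
    """The 'Fedora Account/Project Name' form used with 'dnf copr enable'."""
    return f"{owner}/{project}"


def copr_repo_id(owner: str, project: str,
                 host: str = "copr.fedorainfracloud.org") -> str:
    """Repository ID as displayed in the dnf transaction table above."""
    return f"copr:{host}:{owner}:{project}"


print(rpm_name_for_gem("jekyll-redirect-from"))
print(copr_repo_spec("xlark", "jekyll-gems"))
print(copr_repo_id("xlark", "jekyll-gems"))
```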

Projects and their corresponding repositories created in Copr are public. Anyone running Fedora can enable your repository using dnf copr enable.

Others can benefit from whatever you package using Copr, so if you package something useful, feel free to share with others!

Resurrecting the VAXen, part 3: Hardware Repairs

2021-05-29T17:30:00-04:00

This post is part of an on-going, multi-decade series on my half-hearted attempts to get all my VAX hardware up and running.

Start and stop is the nature of a lot of personal projects. I started to get my VAXen up and running several years ago and then, had to stop for other life reasons.

My condo was remodeled and I’m having to reassemble my lab, so it seems like a good time to restart the project.

First things first: I tested all the VAXen as I took them out of storage. All booted and got to the boot monitor, except for my VAXstation 4000/90. For the entire time I’ve owned it, this poor VAX has never booted and I never dedicated the time to diagnosing it. This is my most powerful VAX, so I’ve always wanted to restore it to a working state.

Turns out, there were two major issues with this VAX:

A Damaged SIMM socket, presumably preventing memory
A dead clock battery

VAXen have diagnostic lights to assist in boot up diagnostics. This VAX had all eight lights lit steady on bootup, and would not progress from there.

There are eight lights!

According to the VAXArchive, this means “Power is applied, but no instruction is executed”. During normal startup, this is the first light combination displayed, but a healthy VAX quickly progresses to other light sequences.

Staying in this state, per my research, generally indicates a memory issue or an issue with the real time clock.

First, I experimented with the memory. Maybe I had a bad module? Luckily, my VAXstation 4000/60 is functional and takes the same SIMMs as the 4000/90.

To my knowledge there are 2 sizes of SIMM modules available for VAXstation 4000s: 4MB and 16MB. These are fairly easy to distinguish based on the labels affixed to them. AA means 4MB, CA means 16MB.

SIMM Differences.

For the 4000/90, these need to be installed in sets of four. The 4000/90 has 8 slots and no on-board RAM, whereas the 4000/60 has 6 slots and 8MB of onboard RAM.

The 4000/60 and the 4000/90 are different in how SIMMs must be installed and the differences are not immediately intuitive. Fortunately, the service manual is available in the MANX Archive.

For the 4000/60, matching pairs must be installed with the lower capacity SIMMs in lower numbered banks.

For the 4000/90, matching sets of 4 must be installed in a staggered pattern.

4000/60 vs 4000/90 SIMM Arrangement.
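The memory totals that follow from these rules are easy to work out. The Python sketch below does the arithmetic, assuming (as I understand it, but have not verified against the service manual) that sets of different sizes can be mixed. It does not model which physical slots each set must occupy, and the function name and the AA/CA shorthand keys are just for illustration.

```python
SIMM_SIZES_MB = {"AA": 4, "CA": 16}

MODELS = {
    # onboard RAM in MB, number of SIMM slots, SIMMs per matching set
    "4000/60": {"onboard": 8, "slots": 6, "set_size": 2},
    "4000/90": {"onboard": 0, "slots": 8, "set_size": 4},
}


def total_memory_mb(model: str, simm_sets: list) -> int:
    """Total RAM for a model given one label ('AA' or 'CA') per populated set."""
    rule = MODELS[model]
    max_sets = rule["slots"] // rule["set_size"]
    if len(simm_sets) > max_sets:
        raise ValueError(f"{model} only has room for {max_sets} sets")
    simm_total = sum(SIMM_SIZES_MB[label] * rule["set_size"] for label in simm_sets)
    return rule["onboard"] + simm_total


print(total_memory_mb("4000/90", ["CA", "CA"]))        # 128
print(total_memory_mb("4000/90", ["CA"]))              # 64
print(total_memory_mb("4000/60", ["CA", "CA", "CA"]))  # 104
```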

All of the RAM from the 4000/90 tested good in the 4000/60, so all of the SIMM modules were good.

My 4000/90 is “fully loaded”, and it has always had all SIMM spots occupied. So, knowing that the SIMMs were good, I loaded only one set of four.

On power up, I got the boot monitor!

This is the first time this machine has booted in the decade I’ve owned it!

The culprit here was a damaged SIMM socket. It’s very difficult to get a picture of this, but I’ve tried.

SIMM Damage.

I’ve ordered what may be a replacement from Digi-Key.

Now, this solved some of the problems, but it was not always reliably booting. If I started it completely cold, it would still be stuck with all diagnostic LEDs illuminated. However, a quick power cycle after a few minutes of running would reliably get the machine to the boot monitor.

This has to be the real time clock, as I’ve seen reports that a dead real time clock will cause this symptom. The battery is, I presume, the original and does not hold a charge. Interestingly enough, the 4000/60 does not seem to be affected by this, even though it’s a similar, but not identical, system board.

Most DEC products of this era use the DS1287 RTC chip. This chip contains both the circuitry for the clock, as well as a built in battery. These have been out of production for years, and many people have resorted to hacking them to replace the built in battery.

DS1287A and Compatibles.

I can’t be arsed to do something like that, especially because there are plug-compatible parts still being produced. The DS12887+ and DS12887A+ are produced by Maxim Integrated Products, and I’ve used them before in PCs and other systems that use the DS1287. I figured I might as well replace ALL of my DS1287s, so I’ve ordered several from Digi-Key. At over $10 a pop, they are not as economical as coin cells, but it beats having to do some time-consuming modification to either the DS1287 chip or the board.

The last remaining issue is that the SCSI2SD card that I’m using with the VAXes always throws an error on boot. I can later issue boot dka0 from the system monitor, but this error prevents the VAX from auto-booting.

?? 001  10      SCSI  0048

Currently, I can’t seem to find any information for this error, but that’s something for another day.

OpenShift Certificate from IPA on RHEL 8

2020-11-04T11:53:56-05:00

OpenShift and IPA Series

This post is part of a collection of blog posts related to OpenShift and FreeIPA (aka IdM).

OpenShift Certificate from IPA on RHEL 8
OpenShift authentication with IPA
OpenShift Group Syncing with IPA (Not yet published)
Automated Certificate Management with IPA and cert-manager (Not yet published)

Introduction

I’m rebuilding my lab with a focus on OpenShift 4 and during this rebuild, I’m working on “modernizing” several aspects of my configuration. This includes attempting (again) to integrate in FreeIPA for managing hosts, certificates and other items.

OpenShift 4’s default node operating system, CoreOS, does not support being directly attached to FreeIPA in the way a traditional Linux host would be. However, OpenShift 4 still requires certain certificates and can integrate with an LDAP provider for authentication and authorization.

This post describes creating the certificates for the ingress controller and API endpoints for OpenShift. This isn’t nearly as straightforward as it would seem.

For this example, I’m using the version of FreeIPA included with RHEL 8, which has an official product name of Identity Management (IdM). For this article, I’ll be calling everything FreeIPA to keep it generic.

Environment Description

First, let’s look at the environment. This process requires only a few pieces of information:

FreeIPA base domain: private.opequon.net
OpenShift base domain: ocp4.private.opequon.net
API and Ingress Controller IP Address: 172.31.0.120

Creating the Certificates

Creating DNS Zones and Host Entries

First create the ocp4.private.opequon.net zone. This will be the base name of my cluster.

ipa dnszone-add ocp4.private.opequon.net --admin-email=admin@private.opequon.net

Next create entries for each required OpenShift hostname. apps must be created, even though only the wildcard is used, because it’s used to create the certificate later on.

ipa dnsrecord-add ocp4.private.opequon.net api --a-rec=172.31.0.120
ipa dnsrecord-add ocp4.private.opequon.net api-int --a-rec=172.31.0.120
ipa dnsrecord-add ocp4.private.opequon.net apps --a-rec=172.31.0.120
ipa dnsrecord-add ocp4.private.opequon.net '*.apps' --a-rec=172.31.0.120

Notice, I’m not creating reverse entries here as they are not specifically required and, since 172.31.0.120 is shared between the API and Ingress Controller, I’m not sure what I’d want them to be. In a real production situation, I would likely want reverse entries for everything.
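One thing to watch when scripting these commands: an unquoted *.apps can be expanded by the shell if a matching file happens to exist in the current directory. As a small illustration, this Python sketch generates the same four commands with shell-safe quoting via the standard shlex module. The helper name is hypothetical; the zone, record names and address are the ones used above.

```python
import shlex

ZONE = "ocp4.private.opequon.net"
ADDRESS = "172.31.0.120"
RECORDS = ["api", "api-int", "apps", "*.apps"]


def dnsrecord_add_commands(zone, records, address):
    """Build 'ipa dnsrecord-add' command lines with each argument shell-quoted."""
    commands = []
    for record in records:
        argv = ["ipa", "dnsrecord-add", zone, record, f"--a-rec={address}"]
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


for command in dnsrecord_add_commands(ZONE, RECORDS, ADDRESS):
    print(command)
```

Only the wildcard record needs quoting, so the last generated line contains '*.apps' while the others are unchanged.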

Finally, create the host principals.

ipa host-add apps.ocp4.private.opequon.net
ipa host-add api.ocp4.private.opequon.net

Wildcard Profile

With FreeIPA, certificates are generated based on a profile and the default profile is acceptable for most normal certificates, like our api certificate.

By default, FreeIPA does not include a profile for creating wildcard certificates and much of the documentation has warnings around wildcard certificates being deprecated. While this may be true in some sense, wildcard certificates are still very much a part of the OpenShift installation.

In order to support wildcard certificates, we need to create a new certprofile, which defines how these certificates should be created. This profile will take care of prefixing the certificate’s Common Name and Subject Alternate Names (SANs) with *..
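The effect of such a profile can be sketched in a few lines of Python. This is only a model of the string substitution, using patterns of the kind that appear in the modified profile (the $request.req_subject_name.cn$ placeholder stands for the Common Name from the CSR). It is not how the certificate system processes profiles internally, and the function names are hypothetical.

```python
CN_PLACEHOLDER = "$request.req_subject_name.cn$"

# Patterns modelled on the wildcard profile's subject and SAN settings.
SUBJECT_PATTERN = "CN=*.$request.req_subject_name.cn$, O=PRIVATE.OPEQUON.NET"
SAN_PATTERNS = [
    "*.$request.req_subject_name.cn$",
    "$request.req_subject_name.cn$",
]


def expand(pattern: str, common_name: str) -> str:
    """Substitute the CSR's Common Name into a profile pattern."""
    return pattern.replace(CN_PLACEHOLDER, common_name)


def issued_names(common_name: str):
    """Return the subject and SAN DNS names the profile would produce."""
    subject = expand(SUBJECT_PATTERN, common_name)
    sans = [expand(pattern, common_name) for pattern in SAN_PATTERNS]
    return subject, sans


subject, sans = issued_names("apps.ocp4.private.opequon.net")
print(subject)
print(sans)
```

The point to notice is that the CSR carries the plain host name, and the wildcard prefix is added by the profile.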

To define a new profile, first, we extract the default certprofile:

ipa certprofile-show caIPAserviceCert --out wildcard.cfg

Next we need to modify the cert profile to automatically prefix the Subject and SAN fields with *.. The instructions for how to do this were found on Fraser Tweedale’s blog and are an extension of Documentation on the FreeIPA wiki.

The diff below shows the changes needed to the wildcard.cfg file:

[root@ipa ocpcerts]# diff wildcard.cfg wildcard.cfg.orig 
19c19
< policyset.serverCertSet.1.default.params.name=CN=*.$request.req_subject_name.cn$, O=PRIVATE.OPEQUON.NET
---
> policyset.serverCertSet.1.default.params.name=CN=$request.req_subject_name.cn$, O=PRIVATE.OPEQUON.NET
32,40c32,33
< policyset.serverCertSet.12.default.class_id=subjectAltNameExtDefaultImpl
< policyset.serverCertSet.12.default.name=Subject Alternative Name Extension Default
< policyset.serverCertSet.12.default.params.subjAltNameNumGNs=2
< policyset.serverCertSet.12.default.params.subjAltExtGNEnable_0=true
< policyset.serverCertSet.12.default.params.subjAltExtType_0=DNSName
< policyset.serverCertSet.12.default.params.subjAltExtPattern_0=*.$request.req_subject_name.cn$
< policyset.serverCertSet.12.default.params.subjAltExtGNEnable_1=true
< policyset.serverCertSet.12.default.params.subjAltExtType_1=DNSName
< policyset.serverCertSet.12.default.params.subjAltExtPattern_1=$request.req_subject_name.cn$
---
> policyset.serverCertSet.12.default.class_id=commonNameToSANDefaultImpl
> policyset.serverCertSet.12.default.name=Copy Common Name to Subject Alternative Name
119c112
< profileId=wildcard
---
> profileId=caIPAserviceCert

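After updating the wildcard.cfg file, it needs to be imported into IPA as a new profile, and the appropriate hosts (in this case apps.ocp4.private.opequon.net) associated with that profile through a CA ACL. The commands below assume a CA ACL named wildcard-hosts already exists; if it does not, create it first with ipa caacl-add wildcard-hosts.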
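Before running the import, it can be worth confirming that the edits from the diff actually landed in wildcard.cfg. The following is a minimal sketch and not part of the original procedure: the function name check_wildcard_cfg is made up for illustration, and it only looks for the exact lines shown in the diff above.

```shell
# Hypothetical helper: succeed only if the edited profile contains the
# changes from the diff (wildcard CN, wildcard SAN pattern, new profileId).
# Usage: check_wildcard_cfg FILE
check_wildcard_cfg() {
    # Subject name pattern must start with "CN=*." (fixed-string match)
    grep -qF 'policyset.serverCertSet.1.default.params.name=CN=*.$request.req_subject_name.cn$' "$1" || return 1
    # SAN entry 0 must carry the wildcard pattern
    grep -qF 'policyset.serverCertSet.12.default.params.subjAltExtPattern_0=*.$request.req_subject_name.cn$' "$1" || return 1
    # The profile must identify itself as "wildcard" (whole-line match)
    grep -qx 'profileId=wildcard' "$1" || return 1
}
```

For example, check_wildcard_cfg wildcard.cfg && echo "wildcard.cfg looks ready to import". If the check passes, the import proceeds with the commands that follow.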

ipa certprofile-import wildcard --file ./wildcard.cfg --desc 'Wildcard certificates' --store 1

ipa caacl-add-profile wildcard-hosts --certprofiles wildcard

ipa caacl-add-host wildcard-hosts --hosts apps.ocp4.private.opequon.net

Referenced in the documentation is the need to add the ipa certificate authority to that profile. I’m not sure if this is actually needed.

ipa caacl-add-ca wildcard-hosts --cas ipa

Generating CSRs

Once the wildcard profile is in place, a certificate signing request (CSR) must be created for each domain.

In the CSR for *.apps.ocp4.private.opequon.net, it’s important to note that the CSR’s Common Name does not include the *. portion, just apps.ocp4.private.opequon.net. The wildcard profile created in the previous section will prefix *. to the certificate it generates based on the CSR.

# openssl req -newkey rsa:4096 -keyout apps.key -out apps.csr

Generating a RSA private key
.........................................................................................................................................................++++
.............................................................++++
writing new private key to 'apps.key'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:VA
Locality Name (eg, city) [Default City]:Reston
Organization Name (eg, company) [Default Company Ltd]:Opequon Networks
Organizational Unit Name (eg, section) []:OpenShift
Common Name (eg, your name or your server's hostname) []:apps.ocp4.private.opequon.net
Email Address []:admin@private.opequon.net

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

This creates a csr file and an encrypted key, which we will later decrypt for loading into OpenShift.

I generally rename the keyfiles to indicate that they are encrypted:

mv -iv apps.key apps.enc.key

We will want to generate a CSR for both apps.ocp4.private.opequon.net and api.ocp4.private.opequon.net.
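Since the same prompts have to be answered twice, the CSRs can also be generated without interaction. This is a sketch under assumptions rather than the procedure used above: make_csr is a hypothetical helper, the subject fields are copied from the interactive session (minus the email address), and -nodes writes the key unencrypted, so the rename and later decryption steps would not apply to keys created this way.

```shell
# Hypothetical helper: create NAME.key and NAME.csr without prompts.
# Usage: make_csr NAME FQDN [BITS]
make_csr() {
    name="$1"; fqdn="$2"; bits="${3:-4096}"
    # -nodes skips passphrase encryption of the key, unlike the
    # interactive run shown earlier.
    openssl req -new -newkey "rsa:${bits}" -nodes \
        -keyout "${name}.key" -out "${name}.csr" \
        -subj "/C=US/ST=VA/L=Reston/O=Opequon Networks/OU=OpenShift/CN=${fqdn}"
}
```

Calling make_csr apps apps.ocp4.private.opequon.net and then make_csr api api.ocp4.private.opequon.net would produce both requests. Note that the Common Name for apps still omits the *. portion, since the wildcard profile adds it.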

Generating Certificates

The csr file is input to the ipa cert-request command. For the *.apps domain we specify the --profile wildcard option to generate the certificate with our wildcard profile.

# ipa cert-request apps.csr --principal host/apps.ocp4.private.opequon.net --profile wildcard
  Issuing CA: ipa
  Certificate: MIIFvTCC
  Subject: CN=*.apps.ocp4.private.opequon.net,O=PRIVATE.OPEQUON.NET
  Issuer: CN=Certificate Authority,O=PRIVATE.OPEQUON.NET
  Not Before: Mon Nov 02 17:55:28 2020 UTC
  Not After: Thu Nov 03 16:55:28 2022 UTC

The api certificate is generated in the same fashion, but without the --profile wildcard flag.

ipa cert-request api.csr --principal host/api.ocp4.private.opequon.net

I have not found a good way to actually extract the certificate PEM file from IPA, so I just copy the output of ipa cert-request into a textfile (e.g. apps.crt, but the name doesn’t really matter) in the normal PEM format:

-----BEGIN CERTIFICATE-----
MIIFvTCC
-----END CERTIFICATE-----
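The copy-and-paste step can be scripted. A minimal sketch, assuming the Certificate: value from the ipa cert-request output has been saved as a base64 string in a text file; b64_to_pem is a hypothetical helper name, and 64-character lines are the conventional PEM layout.

```shell
# Hypothetical helper: read a base64 certificate body on stdin and
# print it as a PEM certificate with 64-character lines.
b64_to_pem() {
    printf '%s\n' '-----BEGIN CERTIFICATE-----'
    # Drop any whitespace picked up while copying, then re-wrap.
    tr -d '[:space:]' | fold -w 64
    # fold does not terminate the final line, so add the newline here.
    printf '\n%s\n' '-----END CERTIFICATE-----'
}
```

For example, b64_to_pem < apps.b64 > apps.crt, where apps.b64 is an assumed file name holding the pasted value.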

Finally, when creating the CSRs, openssl encrypts the key file with a passphrase unless told otherwise (for example with -nodes). OpenShift (and most other users of SSL) expects the key file to be unencrypted. So to decrypt the key, use

openssl rsa -in apps.enc.key -out apps.key

A final item to gather is the IPA certificate, which can be found in /etc/ipa/ca.crt on any host enrolled to the FreeIPA server.

At the conclusion of this process, I should have a set of certificates, key files, CSR files, and the IPA Certificate Authority.

[root@yavanna ssl]# ls -la
-rw-r--r--.  1 root root 2071 Nov  2 13:47 api.crt
-rw-r--r--.  1 root root 1789 Nov  2 13:46 api.csr
-rw-------.  1 root root 3414 Nov  2 13:45 api.enc.key
-rw-------.  1 root root 3243 Nov  2 13:50 api.key
-rw-r--r--.  1 root root 2119 Nov  2 13:44 apps.crt
-rw-r--r--.  1 root root 1793 Nov  2 13:43 apps.csr
-rw-------.  1 root root 3414 Nov  2 13:42 apps.enc.key
-rw-------.  1 root root 3243 Nov  2 13:50 apps.key
-rw-r--r--.  1 root root 1667 Nov  2 13:51 ca.crt
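With this many similarly named files, it is easy to pair a certificate with the wrong key. As a hedged sanity check (not part of the original steps), the public key embedded in a certificate can be compared with the one derived from the decrypted private key. key_matches_cert is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the private key belongs to the certificate.
# Usage: key_matches_cert CERT KEY   (KEY should be the decrypted key file)
key_matches_cert() {
    # Public key as stored in the certificate
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
    # Public key derived from the private key
    key_pub=$(openssl pkey -in "$2" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}
```

For the files listed above that would be key_matches_cert apps.crt apps.key and key_matches_cert api.crt api.key. Separately, openssl verify -CAfile ca.crt apps.crt api.crt checks that both certificates were issued by the IPA Certificate Authority.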

Loading New Certificates into OpenShift

Now that we have gathered all the materials, we can load these into an OpenShift cluster. There are three primary steps.

Loading the Certificate Authority

Since my FreeIPA Certificate Authority is self-generated, it’s not included in the default CoreOS certificate bundle. In order to ensure that all CoreOS nodes trust my Certificate Authority, it must be loaded into the cluster-wide proxy configuration.

Confusingly enough, this step is included with the documentation for Replacing the default ingress certificate although it’s not directly related to that action.

First create a configmap containing the root CA which we gathered in the previous section:

oc create configmap custom-ca \
     --from-file=ca-bundle.crt= \
     -n openshift-config

The name of this configmap is custom-ca. The name of the configmap can be changed, but it must match the name used in the next patch command!
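The file path after ca-bundle.crt= is blank in the command above. Purely as an illustration, and assuming the CA certificate was saved as ca.crt in the current directory as in the earlier file listing, the call could be wrapped as follows. create_custom_ca is a hypothetical name and the default path is an assumption, not a value given in the original command.

```shell
# Hypothetical wrapper around the configmap creation.
# Usage: create_custom_ca [CA_FILE]
create_custom_ca() {
    # Defaults to ./ca.crt, an assumed location taken from the file listing.
    ca_file="${1:-ca.crt}"
    oc create configmap custom-ca \
        --from-file=ca-bundle.crt="$ca_file" \
        -n openshift-config
}
```

Running create_custom_ca with no arguments would then use ./ca.crt, while create_custom_ca /etc/ipa/ca.crt would read the copy found on an enrolled host.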

Next, patch the cluster-wide proxy to include this certificate authority.

oc patch proxy/cluster \
     --type=merge \
     --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

Again, if the ConfigMap was created with a different name than custom-ca then this patch command must be updated to match the name of the ConfigMap!

After applying this patch, the MachineConfigOperator will update all nodes in the cluster. Certain nodes may be down during the update, and it may be desirable to wait to perform the next steps until all nodes are back up and the ClusterOperators are all in a good state.
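The waiting can be scripted rather than watched. This is a rough sketch, not something from the official procedure: wait_for_node_rollout is a hypothetical helper, and the 30m default timeout is an assumed value. If it is run immediately after the patch, the pools may still report Updated before the rollout has begun, so a short pause first is sensible.

```shell
# Hypothetical helper: block until every MachineConfigPool reports the
# Updated condition, then print the ClusterOperator table for review.
# Usage: wait_for_node_rollout [TIMEOUT]
wait_for_node_rollout() {
    wait_timeout="${1:-30m}"   # assumed default; size it to the cluster
    oc wait machineconfigpool --all --for=condition=Updated \
        --timeout="$wait_timeout" || return 1
    oc get clusteroperators
}
```

The ClusterOperator output still needs a human look: every operator should show as available and not degraded before continuing.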

Loading the Ingress Controller Certificate

See Replacing the default ingress certificate for the official documentation.

To load the Ingress Controller certificate, first we create a certificate chain file which contains the *.apps.ocp4.private.opequon.net certificate, then any intermediate certificates and finally the certificate authority. My simple FreeIPA setup does not have any intermediates, so we can just concatenate the apps.crt and ca.crt files together.

cat apps.crt ca.crt > appsChain.crt

Do be careful about line endings. It’s important that when certificates are concatenated together, each certificate begins on its own line. For example:

GOOD:

-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----

BAD:
-----END CERTIFICATE----------BEGIN CERTIFICATE-----
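One way to avoid the BAD case is to let awk do the concatenation, since it terminates every line it prints, including a final line that lacked a newline in the input. A minimal sketch, with build_chain as a hypothetical helper name:

```shell
# Hypothetical helper: concatenate PEM files so each ends with a newline.
# Usage: build_chain OUTPUT CERT [CERT...]
build_chain() {
    out="$1"; shift
    # "awk 1" prints every input line followed by a newline, so a file
    # without a trailing newline cannot run into the next certificate.
    awk 1 "$@" > "$out"
    # Fail if an END marker is still glued to a BEGIN marker.
    ! grep -q -- '-----END CERTIFICATE----------BEGIN CERTIFICATE-----' "$out"
}
```

For the ingress chain this would be build_chain appsChain.crt apps.crt ca.crt.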

Once the chain file is created, then we can use it with the key file to create a secret. This example names the secret custom-ingress-secret. As in the prior section, the name of this secret can be changed, but it must be consistent between all commands!

oc create secret tls custom-ingress-secret \
     --cert= \
     --key= \
     -n openshift-ingress
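The --cert= and --key= values are blank in the command above. As a hedged illustration only, assuming the chain file and decrypted key created earlier (appsChain.crt and apps.key) sit in the current directory, the call could be wrapped like this. create_tls_secret is a hypothetical helper; the file names come from this article, not from the original command.

```shell
# Hypothetical wrapper around "oc create secret tls".
# Usage: create_tls_secret NAME CERT_CHAIN KEY NAMESPACE
create_tls_secret() {
    # KEY must be the decrypted key, not the passphrase-protected one.
    oc create secret tls "$1" \
        --cert="$2" \
        --key="$3" \
        -n "$4"
}
```

For the ingress certificate that would be create_tls_secret custom-ingress-secret appsChain.crt apps.key openshift-ingress.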

With the secret in place, we can patch the IngressController object:

oc patch ingresscontroller.operator default \
     --type=merge -p \
     '{"spec":{"defaultCertificate": {"name": "custom-ingress-secret"}}}' \
     -n openshift-ingress-operator

Again, it’s very important that the name of the secret match in the patch command!

This should cause Ingress Controller pods to be recreated in the openshift-ingress project. Once these new pods are ready, the endpoint should be using this new certificate.
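To confirm which certificate the endpoint actually serves, the handshake can be inspected with openssl. A small sketch, assuming network access to the cluster; summarize_cert is a hypothetical helper, and the console route host name in the comment is the usual OpenShift default rather than something stated in this article.

```shell
# Hypothetical helper: read one PEM certificate on stdin and print its
# subject, issuer and validity dates.
summarize_cert() {
    openssl x509 -noout -subject -issuer -dates
}

# Example against a live cluster (host name is an assumption):
#   host=console-openshift-console.apps.ocp4.private.opequon.net
#   openssl s_client -connect "$host:443" -servername "$host" \
#       </dev/null 2>/dev/null | summarize_cert
```

The subject should show the wildcard Common Name and the issuer should be the IPA Certificate Authority.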

Loading the API Certificate

See Adding API server certificates for the official documentation.

Loading the API certificate is the most dangerous part of this activity. If done incorrectly, then the apiservers can become unavailable. Luckily, if this is the case, then the OpenShift web console will still be available via the ingress controller.

The overall process is very similar to updating the Ingress Controller certificate.

To start, we again need to create a certificate chain:

cat api.crt ca.crt > apiChain.crt

Next we need to create a secret again:

oc create secret tls custom-api-certificate \
     --cert= \
     --key= \
     -n openshift-config

Finally, we need to patch the APIServer object:

oc patch apiserver cluster \
     --type=merge -p \
     '{"spec":{"servingCerts": {"namedCertificates":
     [{"names": ["api.ocp4.private.opequon.net"], 
     "servingCertificate": {"name": "custom-api-certificate"}}]}}}' 

Again, the name of the secret must match what is put into the above patch. Additionally, notice that we must specify the hostname associated with this certificate.

This will cause the API pods on the OpenShift masters to restart with the new certificate. During such time, there may be small outages of the API depending on the load balancer setup.

It is important to note that after applying this change, the kubeconfig file generated during installation may not work or may throw certificate errors. The kubeadmin username and password should continue to work.

Next Steps

After loading certificates, the next step would be to integrate FreeIPA’s LDAP as an authentication provider. This topic is covered in the next part.

OpenShift Certificate from IPA on RHEL 8
OpenShift authentication with IPA
OpenShift Group Syncing with IPA (Not yet published)
Automated Certificate Management with IPA and cert-manager (Not yet published)
