Onboarding

tl;dr

1 matrix

in matrix (element) go to room 'Ask the bot'
enter:

!on <firstname> <lastname> <email>

firstname and lastname must not contain spaces; use _ instead
the email must not contain spaces either (obviously)
you need to be a registered onboarding agent
There is a longer form: !onboarding firstname:/ lastname:/ email:/

2 create_openproject_member_tasks

The dagster asset 'create_openproject_member_tasks' then creates entries in Openproject.

TODO: Schedule the asset

3 specification

In Openproject the entries need to be reviewed and specified (which accounts, username etc.)
The Status needs to be set to 'scheduled' for the job to be picked up.

4 create_user_accounts

The dagster asset 'create_user_accounts' then uses the openproject entries to create the specified user accounts.

There could be an auto-create option when entries are created by matrix (secure channel)

Goal:

creating a scalable, highly automated process to onboard new members

convenience (in between 2 🍺 in Tati)
security
- only authorised users can create accounts
- overview of account details
multiple origins

Out of Scope

remaining manual tasks (for now):

creation of unix accounts (for now)
github (needed ?)
xwiki (for now)
onboarding via MCP Agent (usability?)

Tech Stack

A Plone Grafitti instance is used to set up basic onboarding information (@maikroeder)
Minimal csv File as starting point
Nextcloud is used to host onboarding_user.csv
Matrix is used to host an onboarding bot
CouchDB is used to store User Data
Dagster is used to orchestrate data pipelines
Openproject is used to manage tasks and status

Alternative agents

Onboarding can be initiated by various agents.
Each agent either writes a minimal csv file or writes a json string to couchDB.

The end result is a minimal entry in couchDB.

Process Pipelines

--- 
config: 
  theme: 'forest' 
---
sequenceDiagram
	autonumber
	participant matrix as Matrix Bot
	participant csv as CSV File
	participant op as OP Member Tasks
	actor admin as Onboarding Admin
	participant next as NX user data
	participant opu as OP user data
	participant couch as CouchDB
	par
		matrix ->> couch: initialisation
	and
		rect rgb(240, 240, 244)
			note right of csv: Import Pipeline
			csv ->> couch: import
		end 
	end
	rect rgb(255,244,245)
		note right of op: create member task pipeline<br/>all docs without member_id
		couch ->> op: create member task
		op ->> couch: write member_id
	end
	admin ->> op: specify task
	admin ->> op: schedule task
	rect rgb(244,244,255)
		note right of op: create account pipeline
		par
			op->>next: create account
			next->>op: write comment
			next->>couch: write user_info 
		and
			op->>opu: create account
			opu->>op: write comment
			opu->>couch: write user_info
		end
	end
	rect rgb(244, 255, 244)
		note right of op: Consolidation Pipeline
		op->> couch: update task account data
		next->> couch: update account data
		opu->> couch: update account data
		couch ->> op: consolidate task account data
	end

Process Initiation

Initiation is done by writing a minimal csv file with the columns firstname,lastname,email,(username)
Onboarding Apps:

[join.openheidelberg.de](https://join.openheidelberg.de) provides a Plone-based application to create datasets of onboarding candidates (needs more elaboration)
CRM or similar application
Authorized nextcloud user

Note

The Onboarding App can

write to nextcloud filesystem on Neuenheim Server or
upload file to nextcloud
the shared Folder 'Admin' is used

Minimal data

The process shall work with minimal input data.
When the username is missing, it defaults to the first letter of the first name followed by the last name.
When the last name is also missing, it defaults to the name part of the email address.
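
The fallback rules above can be sketched as a small helper. This is only an illustration: the function name default_username is hypothetical, and lowercasing the result is an assumption that the text does not state.

```python
def default_username(firstname: str, lastname: str, email: str) -> str:
    """Derive a username when none was supplied (illustrative sketch)."""
    firstname = firstname.strip()
    lastname = lastname.strip()
    if lastname:
        # first letter of the first name followed by the last name
        return (firstname[:1] + lastname).lower()  # lowercasing is an assumption
    # no last name: fall back to the name part of the email address
    return email.split("@", 1)[0].lower()
```

For example, default_username("Ada", "Lovelace", "ada@example.org") gives "alovelace", while an entry without a last name falls back to the part of the email address before the @.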

tbd

We could leave the username empty and rely on the Authorization Task to provide a username.
It might simplify further processing to make the last name required, as it is used to generate the document id in couchdb.

Import Pipeline

recurring task to look for onboarding.csv on nextcloud
iterating over entries
incomplete entries are ignored
check if a matching doc exists in couchdb
import if entry is new
inform/log incomplete entries
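
A minimal sketch of the row handling described above, using only the standard library. Several points are assumptions rather than decisions from this document: the file has no header row, an entry counts as complete when firstname and email are present, and a "matching doc" is one with the same email address. The function name split_rows is hypothetical.

```python
import csv
import io

FIELDS = ("firstname", "lastname", "email", "username")
REQUIRED = ("firstname", "email")  # assumption: lastname and username may be empty


def split_rows(csv_text, known_emails):
    """Return (new_entries, incomplete_rows) for the content of onboarding.csv."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=FIELDS)
    seen = {e.lower() for e in known_emails}  # emails already present in couchdb
    new_entries, incomplete = [], []
    for row in reader:
        # short rows yield None for missing columns; normalise to ""
        entry = {k: (row.get(k) or "").strip() for k in FIELDS}
        if not all(entry[k] for k in REQUIRED):
            incomplete.append(entry)  # to be logged, not imported
            continue
        if entry["email"].lower() in seen:
            continue  # a matching doc already exists
        seen.add(entry["email"].lower())
        new_entries.append(entry)
    return new_entries, incomplete
```

Because already imported emails are skipped, the same growing file can be read on every run without deleting processed rows.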

tbd

it might be easiest to let onboarding.csv just grow
Deleting processed entries or writing a status column could introduce file locking issues and conflicts.
This might also require each app to have its own onboarding.csv in an app-specific subfolder.

flowchart LR
csv[[CSV File]] --> import>import Pipeline]
import --> couch[(CouchDB)]

CouchDB Initiation

Apps can forego the csv file and write to couchDB directly.
This is the preferred method for apps that can do so. An example for this approach is the Matrix oh-bot

A couchDB doc can be created by sending an onboarding command to our oh-bot
/onboarding email:<emailaddress> name:<firstname> [<lastname>] [username: <username>]
the oh-bot checks the sender's id
only a defined collection of matrix users is authorized
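
How the bot might parse the command and check the sender can be sketched as follows. This is not the oh-bot's actual code: the names AUTHORIZED_SENDERS and parse_onboarding, the example Matrix id, and the shape of the returned dict are all hypothetical.

```python
import re

# hypothetical allow-list of Matrix user ids
AUTHORIZED_SENDERS = {"@alice:example.org"}

# matches the keys of the command; \b keeps "name:" from matching inside "username:"
KEY_RE = re.compile(r"\b(email|name|username):\s*")


def parse_onboarding(message, sender):
    """Turn an /onboarding command into a minimal dict for couchDB."""
    if sender not in AUTHORIZED_SENDERS:
        raise PermissionError(f"{sender} is not an authorized onboarding agent")
    command, _, rest = message.strip().partition(" ")
    if command != "/onboarding":
        raise ValueError("not an onboarding command")
    # re.split with one capture group yields [prefix, key, value, key, value, ...]
    parts = KEY_RE.split(rest)
    fields = {k: v.strip() for k, v in zip(parts[1::2], parts[2::2])}
    if not fields.get("email") or not fields.get("name"):
        raise ValueError("email and name are required")
    firstname, _, lastname = fields["name"].partition(" ")
    return {
        "email": fields["email"],
        "firstname": firstname,
        "lastname": lastname.strip(),
        "username": fields.get("username", ""),
    }
```

Checking the sender before parsing means an unauthorized message is rejected without its content being interpreted at all.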

initial couchDB doc

Openproject Pipeline

dagster pipeline collects all docs without a member_id key (this identifies docs as unprocessed)
An onboarding task is created in the openproject onboarding project
status is set to "In specification"
key member_id (id of onboarding task) is added to couchDB doc (this marks the doc as processed)
assigned Onboarding Admin is notified by openproject
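
The selection and marking steps above can be sketched in plain Python. The selector uses CouchDB's Mango $exists operator; the function create_member_tasks and the create_task callback are hypothetical stand-ins for the real dagster asset and the Openproject API call.

```python
# Mango selector for a CouchDB _find query: docs that have no member_id yet
UNPROCESSED_SELECTOR = {"member_id": {"$exists": False}}


def create_member_tasks(docs, create_task):
    """Create one onboarding task per unprocessed doc and record its id.

    create_task is a hypothetical callable (doc, status) -> task id.
    Returns the updated docs that need to be written back to couchDB.
    """
    updated = []
    for doc in docs:
        if "member_id" in doc:
            continue  # already processed
        task_id = create_task(doc, status="In specification")
        updated.append({**doc, "member_id": task_id})
    return updated
```

Writing member_id back is what makes the pipeline safe to re-run: a doc that already carries the key is never turned into a second task.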

flowchart LR
couch[(couchDB)] --> pipe>create_openproject_member_tasks]
pipe --> openproject[(openproject member task)]

Openproject Workflow

Status 'In specification' means the user data needs augmentation.
Status 'In progress' means everything is fine and the user has working accounts.
Status 'Scheduled' means automated processing can begin.
We are considering using status 'closed' for deleting accounts.

flowchart LR
start(((start)))-->isp
isp(In specification)-->assign>complete task specification]
assign-->sp(specified)
assign-->tsd(to be scheduled)
assign-->sd(scheduled)
sp-->tsd
tsd-->sd
sd-->cap>create accounts pipeline]
cap-->success{Success?}
success-->|yes|ip(in progress)
success-->|no|isp
ip-->cp>consolidation pipeline]
cp-->success
ip-->dev(developed)
ip-->stop(((end)))
dev-->stop
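
The status transitions in the flowchart above can also be written down as a lookup table, for example to validate status changes in a pipeline. This is a sketch read off the diagram, not an export of the actual Openproject workflow configuration; the names ALLOWED and can_transition are hypothetical, and the end node is left out.

```python
# allowed status changes, read off the flowchart
ALLOWED = {
    "In specification": {"specified", "to be scheduled", "scheduled"},
    "specified": {"to be scheduled"},
    "to be scheduled": {"scheduled"},
    # result of the create accounts pipeline
    "scheduled": {"in progress", "In specification"},
    # result of the consolidation pipeline, or manual change to developed
    "in progress": {"in progress", "In specification", "developed"},
    "developed": set(),
}


def can_transition(current, target):
    """Return True if the flowchart allows moving from current to target."""
    return target in ALLOWED.get(current, set())
```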

Dagster

Screenshot

Authorization Task

missing information in the onboarding task is specified
the accounts to be created are defined (nextcloud, openproject ..)
status is set to "scheduled"
only Project Admins are allowed to do so (openproject workflow)

Note

lastname shall be added if empty
this ensures a more consistent onboarding
a username can be set instead of the default one

A task can be edited at a later time to grant additional accounts (skills, interests)
This can be done by the user after onboarding

Create Accounts Pipeline

A dagster pipeline fetches all tasks with status 'scheduled'
creates accounts, roles etc.
the task status is set to 'developed'
openproject and nextcloud send an invitation mail to the onboarded user
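
A minimal sketch of the loop this pipeline performs, in plain Python rather than as a dagster asset. The task dict layout, the accounts key, and the creators mapping are assumptions for illustration. On success the sketch sets 'developed' as stated in this section (the workflow flowchart routes success to 'in progress' instead); on failure it returns the task to 'In specification' as in the flowchart.

```python
def run_create_accounts(tasks, creators):
    """Create the accounts requested by every scheduled task.

    creators maps an account name (e.g. "nextcloud") to a hypothetical
    callable that creates the account for a task and raises on failure.
    """
    for task in tasks:
        if task["status"] != "scheduled":
            continue  # only scheduled tasks are picked up
        try:
            for name in task.get("accounts", []):
                creators[name](task)
        except Exception as exc:
            # hand the task back to the admin with an explanation
            task.setdefault("comments", []).append(f"account creation failed: {exc}")
            task["status"] = "In specification"
        else:
            task["status"] = "developed"
    return tasks
```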

Accumulated Data

Consolidation Pipelines

the consolidation pipeline takes into account that accounts can be created outside of the onboarding process.

update couchdb
get nextcloud user data
get openproject user data
update member tasks
When inconsistencies are found a comment is written to the member task

Update CouchDB

All CouchDB Entries are iterated
Member Task is gathered by member_id
no member_id means entry was not processed yet
define what to do on member task not found
write member task info to couchdb

Openproject User Data

All Openproject Users are iterated
a doc is searched by {'openproject': {'openproject_id': id}}
user information is written to doc[openproject]
define what to do if no or multiple docs are found

Nextcloud User Data

all nextcloud users are iterated
a doc is searched by {'nextcloud': {'nextcloud_id': id}}
information is written to doc[nextcloud]
define what to do if no or multiple docs are found
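
The lookup used in the two sections above can be sketched as follows. The nested dict is a valid CouchDB Mango selector for a subfield match; the helper names service_selector and merge_user_info are hypothetical, and returning None for zero or multiple matches is only a placeholder for the behaviour that still has to be defined.

```python
def service_selector(service, user_id):
    """Mango selector matching doc[service][f"{service}_id"] == user_id."""
    return {service: {f"{service}_id": user_id}}


def merge_user_info(docs, service, user_id, info):
    """Write info into doc[service] if exactly one doc matches, else return None."""
    key = f"{service}_id"
    matches = [d for d in docs if (d.get(service) or {}).get(key) == user_id]
    if len(matches) != 1:
        return None  # open question: what to do with no or multiple docs
    matches[0][service].update(info)
    return matches[0]
```

The same helper serves both services, for example service_selector("nextcloud", "ada") or service_selector("openproject", 7).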

Update Member Tasks

All member tasks are iterated
a doc is searched by member_id
if a single doc is found, the doc info is written to the member task and the status is set to "In progress"
if no doc or multiple docs are found, a comment is written and the status is set to "In specification"

we should not write a comment when the member task status is already "In specification"
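
The decision above, including the proposal not to comment on tasks that are already "In specification", could look like this. The task and doc layouts and the function name reconcile_task are assumptions for illustration.

```python
def reconcile_task(task, docs):
    """Update one member task from the couchDB docs that reference it."""
    matches = [d for d in docs if d.get("member_id") == task["id"]]
    if len(matches) == 1:
        task["doc"] = matches[0]  # doc info is written to the member task
        task["status"] = "In progress"
        return task
    # no doc or multiple docs: comment only if the status actually changes
    if task["status"] != "In specification":
        task.setdefault("comments", []).append(
            f"expected 1 couchDB doc, found {len(matches)}"
        )
    task["status"] = "In specification"
    return task
```

Skipping the comment for tasks that are already "In specification" keeps a recurring run from adding the same comment on every pass.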

Onboarding

An invitation mail is sent to new users.
The new users are directed to the polls section.
There they can choose an available onboarding event.
Onboarding will be done via jitsi and in person.

Revoking Accounts

Accounts not used for a defined period shall be revoked.

a mail is sent to the user to inform them of the pending account revocation.
accounts can be set to inactive for a grace period
after that, accounts are deleted to free up resources.
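
The revocation steps above can be sketched as a pure function over timestamps. Neither period is defined in this document, so both are passed in as parameters; the function name revocation_step and the returned labels are hypothetical.

```python
from datetime import datetime, timedelta


def revocation_step(last_used, now, unused_limit, grace_period):
    """Return the state an account should be in: active, inactive or delete."""
    idle = now - last_used
    if idle < unused_limit:
        return "active"
    if idle < unused_limit + grace_period:
        # the user is informed by mail and the account is set to inactive
        return "inactive"
    return "delete"  # grace period is over, free up resources
```

A recurring job could call this for every account and send the notification mail when the result changes from active to inactive.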

Resources:
https://github.com/rawdlite/mcp-openheidelberg
https://github.com/rawdlite/dg-openheidelberg
https://docs.dagster.io/deployment/oss/deployment-options/deploying-dagster-as-a-service
https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools/blob/main/Junie/Prompt.txt
