WO2024256054A1

WO2024256054A1 - Enabling recommendation analytics in a wireless communication network

Info

Publication number: WO2024256054A1
Application number: PCT/EP2024/059341
Authority: WO
Inventors: Konstantinos Samdanis; Emmanouil Pateromichelakis; Dimitrios Karampatsis
Original assignee: Lenovo Singapore Pte Ltd
Current assignee: Lenovo Singapore Pte Ltd
Priority date: 2024-02-28
Filing date: 2024-04-05
Publication date: 2024-12-19
Anticipated expiration: 2026-08-28

Abstract

Various aspects of the present disclosure relate to a method (800) performed by an analytics network function, comprising: receiving (802), from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determining (804), using an analytics algorithm, one or more recommendation actions based on the first request; transmitting (806), to the consumer function, a first response comprising the one or more recommendation actions; obtaining (808) an environment state of the wireless communication network; determining (810), from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjusting (812), based on the environment state and reward feedback information, the analytics algorithm.

Description

ENABLING RECOMMENDATION ANALYTICS IN A WIRELESS COMMUNICATION NETWORK

TECHNICAL FIELD

[0001] The subject matter disclosed herein relates generally to the field of enabling recommendations or prescriptive analytics in a wireless communication network. In particular this document defines an analytics network function (for instance a network data analytics function or logical network function) for wireless communication, a consumer function for wireless communication, a processor for wireless communication, and methods performed by an analytics network function, a consumer function and a processor.

BACKGROUND

[0002] A wireless communications system may include one or multiple network communication devices, such as base stations, which may support wireless communications for one or multiple user communication devices, which may be otherwise known as user equipment (UE), or other suitable terminology. The wireless communications system may support wireless communications with one or multiple user communication devices by utilizing resources of the wireless communication system (e.g., time resources (e.g., symbols, slots, subframes, frames, or the like) or frequency resources (e.g., subcarriers, carriers, or the like)). Additionally, the wireless communications system may support wireless communications across various radio access technologies including third generation (3G) radio access technology, fourth generation (4G) radio access technology, fifth generation (5G) radio access technology, among other suitable radio access technologies beyond 5G (e.g., sixth generation (6G)).

[0003] Network analytics and AI/ML are deployed in the 5G core network via the introduction of a network data analytics function (NWDAF) that considers the support of various analytics types as elaborated in the 3GPP Specification TS 23.288, titled “Architecture enhancements for 5G System (5GS) to support network data analytics services”. Each NWDAF may support one or more Analytics IDs and may have the role of inference, called NWDAF AnLF, or training, called NWDAF MTLF, or both. An AnLF that supports inference for a specific Analytics ID subscribes to a corresponding MTLF that is responsible for training.

SUMMARY

[0004] An article “a” before an element is unrestricted and understood to refer to “at least one” of those elements or “one or more” of those elements. The terms “a,” “at least one,” “one or more,” and “at least one of one or more” may be interchangeable. As used herein, including in the claims, “or” as used in a list of items (e.g., a list of items prefaced by a phrase such as “at least one of” or “one or more of” or “one or both of”) indicates an inclusive list such that, for example, a list of at least one of A, B, or C means A or B or C or AB or AC or BC or ABC (i.e., A and B and C). Also, as used herein, the phrase “based on” shall not be construed as a reference to a closed set of conditions. For example, an example step that is described as “based on condition A” may be based on both a condition A and a condition B without departing from the scope of the present disclosure. In other words, as used herein, the phrase “based on” shall be construed in the same manner as the phrase “based at least in part on.” Further, as used herein, including in the claims, a “set” may include one or more elements.

[0005] There is provided an analytics network function for wireless communication, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the analytics network function to: receive, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determine, using an analytics algorithm, one or more recommendation actions based on the first request; transmit, to the consumer function, a first response comprising the one or more recommendation actions; obtain an environment state of the wireless communication network; determine, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjust, based on the environment state and reward feedback information, the analytics algorithm.

[0006] There is further provided a method performed by an analytics network function, comprising: receiving, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determining, using an analytics algorithm, one or more recommendation actions based on the first request; transmitting, to the consumer function, a first response comprising the one or more recommendation actions; obtaining an environment state of the wireless communication network; determining, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjusting, based on the environment state and reward feedback information, the analytics algorithm.

[0007] There is further provided a consumer function for wireless communication comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the consumer function to: transmit, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receive, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmit, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.

[0008] There is further provided a method performed by a consumer function, comprising: transmitting, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receiving, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmitting, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.
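By way of a non-limiting illustration, the exchange between the consumer function and the analytics network function described above may be sketched as follows. This is a minimal sketch under stated assumptions and not the disclosed implementation: the class and field names are hypothetical, the environment state is omitted for brevity, and the running-mean update merely stands in for whichever analytics algorithm is actually used.

```python
from dataclasses import dataclass, field


@dataclass
class RecommendationRequest:
    """First request for recommendations (field names are hypothetical)."""
    analytics_id: str
    candidate_actions: list[str]


@dataclass
class RecommendationResponse:
    """First response carrying the one or more recommendation actions."""
    actions: list[str]


@dataclass
class RewardFeedback:
    """Reward feedback returned by the consumer for an applied action."""
    action: str
    reward: float


@dataclass
class AnalyticsFunction:
    """Toy analytics network function; the running-mean rule is an assumption."""
    value: dict[str, float] = field(default_factory=dict)
    count: dict[str, int] = field(default_factory=dict)

    def handle_request(self, req: RecommendationRequest) -> RecommendationResponse:
        # Rank the candidate actions by their current estimated value.
        ranked = sorted(req.candidate_actions,
                        key=lambda a: self.value.get(a, 0.0), reverse=True)
        return RecommendationResponse(actions=ranked)

    def handle_feedback(self, fb: RewardFeedback) -> None:
        # Adjust the algorithm: incremental mean of the rewards seen per action.
        n = self.count.get(fb.action, 0) + 1
        old = self.value.get(fb.action, 0.0)
        self.count[fb.action] = n
        self.value[fb.action] = old + (fb.reward - old) / n


# Consumer side: request, apply an action, report the reward, request again.
nf = AnalyticsFunction()
request = RecommendationRequest("qos-recommendation", ["qos_a", "qos_b"])
before = nf.handle_request(request).actions
nf.handle_feedback(RewardFeedback(action="qos_b", reward=1.0))
after = nf.handle_request(request).actions
```

In this sketch the ranking returned to the consumer changes once positive reward feedback has been received for an action, which mirrors the adjustment step of the method.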

[0009] There is further provided a processor for wireless communication, comprising: at least one controller coupled with at least one memory and configured to cause the processor to: input a first request for recommendations or prescriptive analytics of a wireless communication network; obtain, using an analytics algorithm, one or more recommendation actions based on the first request; output a first response comprising the one or more recommendation actions; input or use an environment state of the wireless communication network; input or use a reward feedback information associated with the one or more recommendation actions; and adjust, based on the environment state and reward feedback information, the analytics algorithm. Such a processor may be used in an analytics network function.

[0010] There is further provided a method performed by a processor, comprising: inputting a first request for recommendations or prescriptive analytics of a wireless communication network; obtaining, using an analytics algorithm, one or more recommendation actions based on the first request; outputting a first response comprising the one or more recommendation actions; inputting an environment state of the wireless communication network; inputting a reward feedback information associated with the one or more recommendation actions; and adjusting, based on the environment state and reward feedback information, the analytics algorithm. Such a method may be performed by a processor of an analytics network function.

[0011] There is further provided a processor for wireless communication, comprising: at least one controller coupled with at least one memory and configured to cause the processor to: output a first request for recommendations or prescriptive analytics of a wireless communication network; input a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and output a third response comprising reward feedback information for adjusting the analytics algorithm. Such a processor may be used in a consumer function.

[0012] There is further provided a method performed by a processor comprising: outputting a first request for recommendations or prescriptive analytics of a wireless communication network; inputting a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and outputting a third response comprising reward feedback information for adjusting the analytics algorithm. Such a method may be performed by a processor of a consumer function.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] Figure 1 illustrates an example of a wireless communications system in accordance with aspects of the present disclosure.

[0014] Figure 2 illustrates an example of an overview of NWDAF flavours including potential input data sources and output consumers, in accordance with aspects of the present disclosure.

[0015] Figure 3 illustrates an example of an overview of RL paradigm in accordance with aspects of the present disclosure.

[0016] Figure 4 illustrates an example of a recommendation request based on the RL paradigm in accordance with aspects of the present disclosure.

[0017] Figure 5 illustrates an example of a user equipment (UE) 500 in accordance with aspects of the present disclosure.

[0018] Figure 6 illustrates an example of a processor 600 in accordance with aspects of the present disclosure.

[0019] Figure 7 illustrates an example of a network equipment (NE) 700 in accordance with aspects of the present disclosure.

[0020] Figure 8 illustrates a flowchart of a method 800 performed by a NE in accordance with aspects of the present disclosure.

[0021] Figure 9 illustrates a flowchart of a method 900 performed by a NE in accordance with aspects of the present disclosure.

[0022] Figure 10 illustrates a flowchart of a method 1000 performed by a processor in accordance with aspects of the present disclosure.

[0023] Figure 11 illustrates a flowchart of a method 1100 performed by a processor in accordance with aspects of the present disclosure.

DETAILED DESCRIPTION

[0024] The 3GPP Specification TS 23.288 considers several analytics services identified by an Analytics ID. These analytics services rely on certain input data that needs to be collected and processed by a machine learning (ML) model residing in NWDAF AnLF, in order to derive the analytics output. Typically, Analytics IDs rely on two or more input data items that are collected from different data sources. NWDAF correlates different data inputs, performing complex analytics that can be used as an insight by a consumer to assist in making decisions. An Analytics ID can be related to user behaviour, communication patterns, mobility, service experience, NF load, or slice load, for example.

[0025] Traditionally, network analytics provide an insight (i.e., an analysis based on statistics and/or predictions) to other NFs that use this information to take a decision, e.g., to choose a quality of service (QoS), a traffic steering, etc., for a future time interval. In 3GPP Rel-19, further enhancements focusing on recommendations will be introduced to support NWDAF-assisted policy control and address network abnormal behaviour.

[0026] The NWDAF can gather data from 5GC NFs, application functions (AFs) and operations administration and maintenance (OAM), and hence can have a wide variety of knowledge as documented in clause 5.1.1 of the 3GPP Report TR 23.700-84 titled “Study on Core Network Enhanced Support for Artificial Intelligence (AI) / Machine Learning (ML)”. Hence the NWDAF can be enhanced to assist the PCF in determining QoS parameters that can achieve the expected service experience requirements. In other words, NWDAF, instead of providing an insight, will offer recommendations as a set of potential solutions, out of which the consumer can choose the most suitable one.

[0027] Currently, the AI/ML models employed in NWDAF adopt either supervised or unsupervised learning. A consumer may optionally provide analytics feedback information related to the accuracy including: (i) Analytics ID and/or AI/ML Model ID; (ii) time stamp(s) of the action(s) taken; (iii) information on the location; (iv) time when a potential QoS change may occur; (v) reporting thresholds; (vi) indication whether the action will affect ground truth data (if this is available). Such analytics feedback may be included in Nnwdaf_AnalyticsSubscription and Nnef_AnalyticsExposure services.
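For illustration only, the optional analytics feedback items (i) to (vi) listed above could be grouped into a single record as sketched below. All field names and types are hypothetical and are not taken from the Nnwdaf_AnalyticsSubscription or Nnef_AnalyticsExposure service definitions; the check that at least one identifier is present is one reading of the “and/or” in item (i) and is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AnalyticsFeedback:
    """Container mirroring items (i)-(vi); all field names are hypothetical."""
    analytics_id: Optional[str] = None            # (i) Analytics ID
    model_id: Optional[str] = None                # (i) AI/ML Model ID
    action_timestamps: list[float] = field(default_factory=list)   # (ii)
    location: Optional[str] = None                # (iii) location information
    qos_change_time: Optional[float] = None       # (iv) potential QoS change time
    reporting_thresholds: dict[str, float] = field(default_factory=dict)  # (v)
    affects_ground_truth: Optional[bool] = None   # (vi) if available

    def is_identified(self) -> bool:
        # Assumed reading of item (i): at least one of the two IDs is given.
        return self.analytics_id is not None or self.model_id is not None


feedback = AnalyticsFeedback(model_id="model-7", action_timestamps=[1712300000.0])
```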

[0028] These types of Artificial Intelligence/Machine Learning (AI/ML) models provide an insight based on a model trained behaviour that relies on the collected data. They react slowly to dynamic conditions and unforeseen situations since new training is needed to adjust the model behaviour in the inference phase. In contrast Reinforcement Learning (RL) relies on an Agent that uses a goal directed strategy to provide recommendation actions. An RL Agent adopts a policy that learns to perform a task through an RL algorithm, which interacts with the environment via the means of an action, environment observations (e.g., network state or service state observations) and feedback reward provision, which serves as a measure of how successful an action was with respect to completing a task goal. The main components of RL will now be briefly introduced.

[0029] ‘Network State Observation’ refers to input performance measurements (PMs) and KPIs, analytic service reports, to the optimization algorithm plus network configuration (e.g., which can be modelled as Markovian states).

[0030] ‘Action’ refers to a parameter value, e.g., the network or service parameter value or a set of network parameter values, modified by the RL Agent.

[0031] ‘Reward Feedback’ refers to a measure of a goal improvement, e.g., a network, or service, or application performance improvement.

[0032] Typically, RL requires an initial training that can be facilitated via a simulation toolbox or a digital twin. At the end of the training phase the RL Agent shall gain enough knowledge, to provide an action for potential environment states. Once the agent is applied for a specific network task and the achieved policy can be safely applied to the live or online network, the Agent will continue learning, i.e., to fine tune its policy, from the feedback reward in production also considering a configurable degree of exploration for learning new environment aspects.
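As a non-limiting illustration of a configurable degree of exploration, an epsilon-greedy selection rule may be sketched as follows. Epsilon-greedy is one common choice and is an assumption here; the disclosure does not prescribe a particular exploration strategy, and the function and parameter names are hypothetical.

```python
import random


def select_action(q_values: dict[str, float], epsilon: float,
                  rng: random.Random) -> str:
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        # Explore: pick uniformly among all known actions.
        return rng.choice(sorted(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(q_values, key=q_values.get)


estimates = {"lower_qos": 0.1, "steer_traffic": 0.9}
greedy_choice = select_action(estimates, epsilon=0.0, rng=random.Random(0))
```

With epsilon set to 0 the agent only exploits its learned policy, while a larger epsilon lets the agent keep trying other actions in production at the cost of occasionally recommending a weaker one.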

[0033] Regarding the Management Data Analytics (MDA), recommendations are provided as an output of analytics optionally together with other analytics including statistics and predictions for the purpose of root cause analysis (e.g., to locate a problematic domain network component or identify the cause of a problem) and the components that need to adopt a specified configuration (e.g., NFs and gNBs that need to enter an energy saving state) as per the 3GPP Specification TS 28.104 titled “Management and orchestration; Management Data Analytics (MDA)”. However, these analytics are not based on RL since the AI/ML models rely on model training processes to update the accuracy and there is no reward feedback from the consumer.

[0034] The use of RL is currently introduced in the 3GPP 5G core via a generic framework, where the analytics service provides recommendations and receives environment states and feedback from the consumer with specifying the respective parameters. However, the new study would require further enhancements for introducing recommendations that can be perceived as a set of actions of an RL Agent output. The current 5G core has not yet considered how to handle RL and how to model at least: (i) the RL Agent actions as recommendations; (ii) the network environment observation as state information; and (iii) the feedback reward as a measure of goal impact or success/improvement.

[0035] The disclosure herein provides an apparatus and method that introduces RL in the 5G core network. RL can be introduced in NWDAF or in a logical function that can coexist with NWDAF or in another new NF responsible for prescriptive analytics including RL based recommendation analytics. It shall be noted that if RL is introduced in existing NWDAF then a new flag or other indication may be defined to indicate the choice of recommendation instead of predictions and statistics.

[0036] In any arrangement, the NWDAF or enhanced NWDAF or new NF that adopts RL as the AI/ML model provides actions as recommendations and obtains information related to the network environment either as observations, i.e., by monitoring the network state, and/or in the form of a reward feedback from the consumer.

[0037] Aspects of the present disclosure are described in the context of a wireless communications system.

[0038] Figure 1 illustrates an example of a wireless communications system 100 in accordance with aspects of the present disclosure. The wireless communications system 100 may include one or more NE 102, one or more UE 104, and a core network (CN) 106. The wireless communications system 100 may support various radio access technologies. In some implementations, the wireless communications system 100 may be a 4G network, such as an LTE network or an LTE-Advanced (LTE-A) network. In some other implementations, the wireless communications system 100 may be an NR network, such as a 5G network, a 5G-Advanced (5G-A) network, or a 5G ultrawideband (5G-UWB) network. In other implementations, the wireless communications system 100 may be a combination of a 4G network and a 5G network, or other suitable radio access technology including Institute of Electrical and Electronics Engineers (IEEE) 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20. The wireless communications system 100 may support radio access technologies beyond 5G, for example, 6G. Additionally, the wireless communications system 100 may support technologies, such as time division multiple access (TDMA), frequency division multiple access (FDMA), or code division multiple access (CDMA), etc.

[0039] The one or more NE 102 may be dispersed throughout a geographic region to form the wireless communications system 100. One or more of the NE 102 described herein may be or include or may be referred to as a network node, a base station, a network element, a network function, a network entity, a radio access network (RAN), a NodeB, an eNodeB (eNB), a next-generation NodeB (gNB), or other suitable terminology. An NE 102 and a UE 104 may communicate via a communication link, which may be a wireless or wired connection. For example, an NE 102 and a UE 104 may perform wireless communication (e.g., receive signalling, transmit signalling) over a Uu interface.

[0040] An NE 102 may provide a geographic coverage area for which the NE 102 may support services for one or more UEs 104 within the geographic coverage area. For example, an NE 102 and a UE 104 may support wireless communication of signals related to services (e.g., voice, video, packet data, messaging, broadcast, etc.) according to one or multiple radio access technologies. In some implementations, an NE 102 may be moveable, for example, a satellite associated with a non-terrestrial network (NTN). In some implementations, different geographic coverage areas associated with the same or different radio access technologies may overlap, but the different geographic coverage areas may be associated with different NE 102.

[0041] The one or more UE 104 may be dispersed throughout a geographic region of the wireless communications system 100. A UE 104 may include or may be referred to as a remote unit, a mobile device, a wireless device, a remote device, a subscriber device, a transmitter device, a receiver device, or some other suitable terminology. In some implementations, the UE 104 may be referred to as a unit, a station, a terminal, or a client, among other examples. Additionally, or alternatively, the UE 104 may be referred to as an Internet-of-Things (IoT) device, an Internet-of-Everything (IoE) device, or machine-type communication (MTC) device, among other examples.

[0042] A UE 104 may be able to support wireless communication directly with other UEs 104 over a communication link. For example, a UE 104 may support wireless communication directly with another UE 104 over a device-to-device (D2D) communication link. In some implementations, such as vehicle-to-vehicle (V2V) deployments, vehicle-to-everything (V2X) deployments, or cellular-V2X deployments, the communication link may be referred to as a sidelink. For example, a UE 104 may support wireless communication directly with another UE 104 over a PC5 interface.

[0043] An NE 102 may support communications with the CN 106, or with another NE 102, or both. For example, an NE 102 may interface with other NE 102 or the CN 106 through one or more backhaul links (e.g., S1, N2, or another network interface). In some implementations, the NE 102 may communicate with each other directly. In some other implementations, the NE 102 may communicate with each other indirectly (e.g., via the CN 106). In some implementations, one or more NE 102 may include subcomponents, such as an access network entity, which may be an example of an access node controller (ANC). An ANC may communicate with the one or more UEs 104 through one or more other access network transmission entities, which may be referred to as radio heads, smart radio heads, or transmission-reception points (TRPs).

[0044] The CN 106 may support user authentication, access authorization, tracking, connectivity, and other access, routing, or mobility functions. The CN 106 may be an evolved packet core (EPC), or a 5G core (5GC), which may include a control plane entity that manages access and mobility (e.g., a mobility management entity (MME), an access and mobility management function (AMF)) and a user plane entity that routes packets or interconnects to external networks (e.g., a serving gateway (S-GW), a Packet Data Network (PDN) gateway (P-GW), or a user plane function (UPF)). In some implementations, the control plane entity may manage non-access stratum (NAS) functions, such as mobility, authentication, and bearer management (e.g., data bearers, signal bearers, etc.) for the one or more UEs 104 served by the one or more NE 102 associated with the CN 106.

[0045] The CN 106 may communicate with a packet data network over one or more backhaul links (e.g., via an S1, N2, or another network interface). The packet data network may include an application server. In some implementations, one or more UEs 104 may communicate with the application server. A UE 104 may establish a session (e.g., a protocol data unit (PDU) session, or the like) with the CN 106 via an NE 102. The CN 106 may route traffic (e.g., control information, data, and the like) between the UE 104 and the application server using the established session (e.g., the established PDU session). The PDU session may be an example of a logical connection between the UE 104 and the CN 106 (e.g., one or more network functions of the CN 106).

[0046] In the wireless communications system 100, the NEs 102 and the UEs 104 may use resources of the wireless communications system 100 (e.g., time resources (e.g., symbols, slots, subframes, frames, or the like) or frequency resources (e.g., subcarriers, carriers)) to perform various operations (e.g., wireless communications). In some implementations, the NEs 102 and the UEs 104 may support different resource structures. For example, the NEs 102 and the UEs 104 may support different frame structures. In some implementations, such as in 4G, the NEs 102 and the UEs 104 may support a single frame structure. In some other implementations, such as in 5G and among other suitable radio access technologies, the NEs 102 and the UEs 104 may support various frame structures (i.e., multiple frame structures). The NEs 102 and the UEs 104 may support various frame structures based on one or more numerologies.

[0047] One or more numerologies may be supported in the wireless communications system 100, and a numerology may include a subcarrier spacing and a cyclic prefix. A first numerology (e.g., μ=0) may be associated with a first subcarrier spacing (e.g., 15 kHz) and a normal cyclic prefix. In some implementations, the first numerology (e.g., μ=0) associated with the first subcarrier spacing (e.g., 15 kHz) may utilize one slot per subframe. A second numerology (e.g., μ=1) may be associated with a second subcarrier spacing (e.g., 30 kHz) and a normal cyclic prefix. A third numerology (e.g., μ=2) may be associated with a third subcarrier spacing (e.g., 60 kHz) and a normal cyclic prefix or an extended cyclic prefix. A fourth numerology (e.g., μ=3) may be associated with a fourth subcarrier spacing (e.g., 120 kHz) and a normal cyclic prefix. A fifth numerology (e.g., μ=4) may be associated with a fifth subcarrier spacing (e.g., 240 kHz) and a normal cyclic prefix.

[0048] A time interval of a resource (e.g., a communication resource) may be organized according to frames (also referred to as radio frames). Each frame may have a duration, for example, a 10 millisecond (ms) duration. In some implementations, each frame may include multiple subframes. For example, each frame may include 10 subframes, and each subframe may have a duration, for example, a 1 ms duration. In some implementations, each frame may have the same duration. In some implementations, each subframe of a frame may have the same duration.

[0049] Additionally or alternatively, a time interval of a resource (e.g., a communication resource) may be organized according to slots. For example, a subframe may include a number (e.g., quantity) of slots. The number of slots in each subframe may also depend on the one or more numerologies supported in the wireless communications system 100. For instance, the first, second, third, fourth, and fifth numerologies (i.e., μ=0, μ=1, μ=2, μ=3, μ=4) associated with respective subcarrier spacings of 15 kHz, 30 kHz, 60 kHz, 120 kHz, and 240 kHz may utilize a single slot per subframe, two slots per subframe, four slots per subframe, eight slots per subframe, and 16 slots per subframe, respectively. Each slot may include a number (e.g., quantity) of symbols (e.g., OFDM symbols). In some implementations, the number (e.g., quantity) of slots for a subframe may depend on a numerology. For a normal cyclic prefix, a slot may include 14 symbols. For an extended cyclic prefix (e.g., applicable for 60 kHz subcarrier spacing), a slot may include 12 symbols. The relationship between the number of symbols per slot, the number of slots per subframe, and the number of slots per frame for a normal cyclic prefix and an extended cyclic prefix may depend on a numerology. It should be understood that reference to a first numerology (e.g., μ=0) associated with a first subcarrier spacing (e.g., 15 kHz) may be used interchangeably between subframes and slots.
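The numerology values listed above follow a simple power-of-two pattern, which may be sketched as follows. The closed forms (a subcarrier spacing of 15 kHz multiplied by 2 to the power of the numerology index, and the same power of two for the slots per subframe) reproduce the figures given in the description; the function names are hypothetical, and the default of 10 subframes per frame is taken from the example frame structure described above.

```python
def subcarrier_spacing_khz(mu: int) -> int:
    """Subcarrier spacing for numerology index mu: 15 kHz * 2**mu."""
    return 15 * 2 ** mu


def slots_per_subframe(mu: int) -> int:
    """Number of slots in one 1 ms subframe: 2**mu."""
    return 2 ** mu


def slots_per_frame(mu: int, subframes_per_frame: int = 10) -> int:
    """Number of slots in one frame of 10 subframes (10 ms)."""
    return subframes_per_frame * slots_per_subframe(mu)


def symbols_per_slot(extended_cyclic_prefix: bool = False) -> int:
    """14 symbols for a normal cyclic prefix, 12 for an extended one."""
    return 12 if extended_cyclic_prefix else 14


# Table of the five numerologies discussed in the description.
table = [(mu, subcarrier_spacing_khz(mu), slots_per_subframe(mu))
         for mu in range(5)]
```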

[0050] In the wireless communications system 100, an electromagnetic (EM) spectrum may be split, based on frequency or wavelength, into various classes, frequency bands, frequency channels, etc. By way of example, the wireless communications system 100 may support one or multiple operating frequency bands, such as frequency range designations FR1 (410 MHz - 7.125 GHz), FR2 (24.25 GHz - 52.6 GHz), FR3 (7.125 GHz - 24.25 GHz), FR4 (52.6 GHz - 114.25 GHz), FR4a or FR4-1 (52.6 GHz - 71 GHz), and FR5 (114.25 GHz - 300 GHz). In some implementations, the NEs 102 and the UEs 104 may perform wireless communications over one or more of the operating frequency bands. In some implementations, FR1 may be used by the NEs 102 and the UEs 104, among other equipment or devices for cellular communications traffic (e.g., control information, data). In some implementations, FR2 may be used by the NEs 102 and the UEs 104, among other equipment or devices for short-range, high data rate capabilities.

[0051] FR1 may be associated with one or multiple numerologies (e.g., at least three numerologies). For example, FR1 may be associated with a first numerology (e.g., μ=0), which includes 15 kHz subcarrier spacing; a second numerology (e.g., μ=1), which includes 30 kHz subcarrier spacing; and a third numerology (e.g., μ=2), which includes 60 kHz subcarrier spacing. FR2 may be associated with one or multiple numerologies (e.g., at least 2 numerologies). For example, FR2 may be associated with a third numerology (e.g., μ=2), which includes 60 kHz subcarrier spacing; and a fourth numerology (e.g., μ=3), which includes 120 kHz subcarrier spacing.
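For illustration, the frequency range designations and the example numerology associations given above may be captured in a small lookup as sketched below. The treatment of shared boundaries (lower bound inclusive, upper bound exclusive) is an assumption, since the description does not state to which range a boundary frequency belongs; the sub-range FR4a or FR4-1 is omitted because it lies within FR4, and all names are hypothetical.

```python
from typing import Optional

# Frequency range designations in GHz, as listed in the description.
FREQUENCY_RANGES_GHZ = {
    "FR1": (0.410, 7.125),
    "FR3": (7.125, 24.25),
    "FR2": (24.25, 52.6),
    "FR4": (52.6, 114.25),
    "FR5": (114.25, 300.0),
}

# Example numerology indices associated with FR1 and FR2 in the description.
EXAMPLE_NUMEROLOGIES = {"FR1": (0, 1, 2), "FR2": (2, 3)}


def frequency_range(freq_ghz: float) -> Optional[str]:
    """Return the designation containing freq_ghz, or None if outside all."""
    for name, (low, high) in FREQUENCY_RANGES_GHZ.items():
        # Assumption: lower bound inclusive, upper bound exclusive.
        if low <= freq_ghz < high:
            return name
    return None


designation = frequency_range(28.0)
spacings_khz = [15 * 2 ** mu for mu in EXAMPLE_NUMEROLOGIES[designation]]
```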

[0052] Figure 2 illustrates an example 200 of an overview of NWDAF flavours including potential input data sources and output consumers, in accordance with aspects of the present disclosure.

[0053] The various NWDAF flavours and their respective input data and output result consumers are illustrated. Output result consumers may include 5G core Network Functions (NFs), Application Functions (AFs), 5G core repositories, e.g., Network Repository Function (NRF), Unified Data Management (UDM), etc., and the Operations, Administration and Maintenance (OAM) (Management Service (MnS) Consumer or Management Function (MF)).

[0054] As illustrated in Figure 2, a first Data Collection Coordination Functionality (DCCF) 212 receives inputs from 5G Core Network Functions 202, Application Functions 204, untrusted application functions 204 via a network exposure function 206, 5G core repositories 208, and OAM data 210. The 5G Core Repositories 208 may comprise a NRF, a Binding Support Function (BSF), an Analytics Data Repository Function (ADRF), a UDM, and/or a Unified Data Repository (UDR). The OAM 210 may comprise a MnS Producer or an MF that provides Performance Measurements, Key Performance Indicators, Configuration Management, and Alarm information. Optionally, the first DCCF 212 may provide data to an NWDAF AnLF/MTLF 214, an NWDAF AnLF 216, and an NWDAF MTLF 218.

[0055] The NWDAF containing AnLF/MTLF 214, the NWDAF containing AnLF 216, and the NWDAF containing MTLF 218 may pass on data to the second DCCF 222. The second DCCF 222 may further provide data to the 5G Core Network Functions 224, Application Functions 228, untrusted application functions 228 via a network exposure function 226, 5G core repositories 230, and OAM data 232. The 5G Core Repositories 230 may comprise an ADRF, a UDM, and/or a Unified Data Repository (UDR). The OAM 232 may comprise a Management Services (MnS) Consumer or a Management Function (MF).

[0056] In operation, MTLF 218 and AnLF 216 may exchange AI/ML models. Such exchange may be performed by means of serialization or containerization or via parameters or weights exchange. Optionally, DCCF and MFAF may be involved in the AI/ML model exchange to distribute and collect repeated data towards or from various data sources.

[0057] Figure 3 illustrates an example 300 of an overview of RL paradigm in accordance with aspects of the present disclosure.

[0058] A reinforcement learning agent 310 is shown as comprising a policy 312 and a reinforcement learning algorithm 314. Also shown is an environment 320. Certain procedural steps of reinforcement learning will now be described.

[0059] The reinforcement learning agent 310 takes an action 301 based on policy 312. The action 301 has an effect on environment 320. A feedback reward 302 is provided from the environment 320 to the reinforcement learning algorithm 314 in reinforcement learning agent 310. An observation 303 of the environment 320 is also made following the action 301. The observation 303 and feedback reward 302 are used to adjust reinforcement learning algorithm 314. The reinforcement learning algorithm 314 then updates policy 312 accordingly. The procedure in the example 300 may be repeated to iteratively update policy 312 and reinforcement learning algorithm 314 based on the actions 301, feedback rewards 302 and observations 303.
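
The action, reward and observation cycle described above can be sketched as a short loop. The sketch below is illustrative only: the toy environment, the epsilon-greedy policy and the incremental value update are assumptions chosen for brevity and are not prescribed by the present disclosure; all class and variable names are hypothetical.

```python
import random
from collections import defaultdict


class ToyEnvironment:
    """Hypothetical environment: action 1 is rewarded and advances a toy load state."""

    def __init__(self):
        self.load = 0

    def step(self, action):
        reward = 1.0 if action == 1 else 0.0   # feedback reward (302)
        self.load = (self.load + action) % 3   # the action affects the environment
        return reward, self.load               # reward and observation (303)


class Agent:
    """Toy agent holding a policy (312) and a learning rule (314)."""

    def __init__(self, actions, epsilon=0.2, learning_rate=0.5, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # value estimate per (observation, action)

    def policy(self, observation):
        # Explore with probability epsilon, otherwise pick the best-valued action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(observation, a)])

    def learn(self, observation, action, reward):
        # Move the estimate toward the received reward; this changes the policy.
        key = (observation, action)
        self.q[key] += self.learning_rate * (reward - self.q[key])


env = ToyEnvironment()
agent = Agent(actions=[0, 1])
observation = env.load
for _ in range(500):
    action = agent.policy(observation)               # action (301)
    reward, next_observation = env.step(action)
    agent.learn(observation, action, reward)
    observation = next_observation
```

After the loop, the value estimates favour the rewarded action in each observed state, which is the iterative policy update the paragraph describes.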

[0060] The reinforcement learning agent 310 may be an analytics network function as referred to herein.

[0061] As described herein, the NWDAF or enhanced NWDAF or new NF that adopts RL as the AI/ML model/analytics algorithm may need to initially register its RL capability with a network repository function (NRF), either directly, i.e., by informing the NRF, or by requesting or causing operations, administration and maintenance (OAM) to configure the NRF with this capability for the respective NWDAF or enhanced NWDAF or new NF that adopts RL as the AI/ML model/analytics algorithm. To identify this 5G NF, an identifier may be needed to indicate the support of recommendation actions based on RL.

[0062] The RL model related capabilities registered in the NRF may include at least one of the following: an Analytics ID(s) to identify the analytics (or purpose) for which the RL AI/ML model/analytics algorithm would provide recommendation actions, e.g., policy selection, UPF selection; a Model ID related to the RL AI/ML model/analytics algorithm adopted to provide the respective Analytics ID; NF consumer information or interoperability information that identifies the consumers that can use the RL AI/ML model/analytics algorithm and/or are eligible to provide reward feedback; Use case context that indicates the context of use of the analytics to select the most relevant RL AI/ML model/analytics algorithm; Filter information including, e.g., S-NSSAI, Area of Interest; Input data information, i.e., statistics (such as limits, range, granularity) regarding the data used in the initial training phase, so that the RL based analytics can ask for feedback if the data is outside these limits; Accuracy level(s) that is expected based on the training phase.
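
As an illustration of the capability information listed above, the sketch below models a registered profile and the check of whether live input data falls outside the ranges seen in the initial training phase. The class names, field names and example values are hypothetical and do not correspond to any standardized NRF data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InputRange:
    """Limits of one input parameter as observed in the initial training phase."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass
class RlCapabilityProfile:
    """Hypothetical container for the RL capabilities registered in the NRF."""
    analytics_ids: List[str]
    model_id: str
    eligible_consumers: List[str]
    use_case_context: str
    filters: Dict[str, str]
    input_ranges: Dict[str, InputRange]
    expected_accuracy: str

    def out_of_range_inputs(self, sample: Dict[str, float]) -> List[str]:
        # Names of inputs outside the training limits; a non-empty result
        # suggests that the RL based analytics should ask for feedback.
        return sorted(
            name for name, value in sample.items()
            if name in self.input_ranges
            and not self.input_ranges[name].contains(value)
        )


profile = RlCapabilityProfile(
    analytics_ids=["policy-selection"],
    model_id="rl-model-1",
    eligible_consumers=["PCF"],
    use_case_context="energy-saving",
    filters={"s_nssai": "01-000001", "area_of_interest": "TA-17"},
    input_ranges={"load": InputRange(0.0, 0.8), "latency_ms": InputRange(1.0, 50.0)},
    expected_accuracy="High",
)
needs_feedback = profile.out_of_range_inputs({"load": 0.95, "latency_ms": 12.0})
```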

[0063] A RL analytics consumer (i.e., a consumer function such as a PCF) may request recommendation actions by issuing a request or subscription including at least one of the following attributes: Analytics ID(s) to identify the analytics (or purpose) for which a RL AI/ML model/analytics algorithm would provide recommendation actions, e.g., policy selection, UPF selection; a Model ID related to the RL AI/ML model/analytics algorithm adopted to provide the respective Analytics ID; Filter information including, e.g., S-NSSAI, Area of Interest; time schedule information, i.e., when the recommendation action is needed, e.g., on regular basis, at specific times, urgently, or delay tolerant (adopting the per Event Reporting parameters in Table 4.15.1-1 of the 3GPP Specification TS 23.502 titled “Procedures for the 5G System (5GS)”); a time duration related to the recommendation action; a use case context that indicates the context of usage of the recommendation to select the most relevant RL AI/ML model/analytics algorithm (i.e., when several RL models are available when requesting recommendations per Analytics ID) and/or the most relevant policy for a specific RL AI/ML model/analytics algorithm; a target of recommendation action that indicates the object(s) for which recommendations are requested, which may include entities such as specific UEs, a group of UE(s) or any UE (i.e., all UEs); a preferred level of accuracy of the recommendations ("Low", "Medium", "High" or "Highest") considering the entire time interval or specified times; a preferred input, e.g., data samples, sample granularity, data sources and/or data statistics, for producing the recommendations; reporting thresholds, which indicate conditions on the level of each requested recommendation that, when reached, shall be notified; a notification target address and/or a Notification Correlation ID in case there is a subscription for recommendations.

[0064] The recommendation actions provided by the analytics network function as described herein (i.e., the NWDAF) may include at least one of the following: a task identifier, i.e., an umbrella identifier that characterizes a task, e.g., policy identifier; component names or components identifiers related to a task, i.e., KPIs or parameters (such as throughput, latency) or SLA/QoS parameters, related to a policy; a purpose of the recommendation, e.g., for calculation or selection of a policy or network entity, or for negotiation, or for feasibility check; values related to each component, e.g., of QoS parameters; duration and time schedule of the recommendation; a timestamp related to the production of a recommendation action and/or validity, i.e., until when a recommendation action is valid to be consumed; filter information related to the usage and device applicability including a) area or location in which each recommendation is applicable, e.g., for a mobile user there may be different recommendations per area or location; and/or b) UE conditions related to a recommendation including, e.g., mobility (speed, direction), communication patterns; and/or c) device type, such as mobile user phone, drone, vehicle, sensor device (static or mobile); a network context, i.e., conditions under which a recommendation is applicable, e.g., KPI range or upper value of the expected load; expected reward feedback information related to the capability of the consumer to provide a reward feedback report including the respective timing, type of feedback, e.g., discrete - satisfied/not satisfied or continuous - QoS achieved, for providing such reward feedback; a use case, e.g., energy saving, where a recommendation is applicable; a confidence degree related to each recommendation, i.e., provided per task or component identifier or both.

[0065] The observation needs to characterize the state related to the environment after executing a recommendation, which itself can be modelled, e.g., as Markovian states, i.e., states where well-defined parameters such as handover number, load, QoS level, etc. change. Changes in an environment state require a monitoring process that collects the respective performance measurements or KPIs or analytics (statistics or predictions) or Configuration Management (CM) data provided by the OAM or other analytics services reports (e.g., network performance analytics or service experience analytics) and relates them, specifying a state with respect to the goal or set of goals of the RL based recommendations. An environment state can be continuous, i.e., can change once each related parameter changes, or it can be discrete, i.e., change once a parameter or set of parameters cross a specified threshold or limit. Examples of environment states may include: Network performance including a single parameter, e.g., load or congestion level (high, medium, low) in relation to an NF or area of interest, or a set of parameters, e.g., load and latency in relation to an NF or area of interest; Network configuration (i.e., CM provided by the OAM) including e.g., the energy saving state related to network equipment, considering 5G NFs, gNBs, or both; Service experience including a single parameter, e.g., throughput or QoS (that can be a real value or discrete - high, medium, low) in relation to a specific policy or SLA or a set of parameters, e.g., QoS and energy consumption (that can be a real value or discrete - high, medium, low) in relation to a specific policy or SLA; user mobility considering the cell or TA handover or movement of user out/in a specified geographical area before providing an update or modification, e.g., related to route optimization.
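
The distinction between a continuous and a discrete environment state can be illustrated with a short sketch in which a discrete state changes only when a monitored parameter crosses a threshold. The threshold values, parameter names and function names below are assumptions made for the example.

```python
from bisect import bisect_right

LEVELS = ("low", "medium", "high")


def discretize(value, thresholds=(0.4, 0.8)):
    """Map a real value to low/medium/high using two ascending thresholds."""
    return LEVELS[bisect_right(thresholds, value)]


def environment_state(load, latency_ms, latency_thresholds=(10.0, 50.0)):
    """Discrete state built from a set of parameters (here load and latency)."""
    return (discretize(load), discretize(latency_ms, latency_thresholds))


# Hypothetical monitoring samples: (load ratio, latency in ms).
samples = [(0.30, 8.0), (0.35, 9.0), (0.55, 9.5), (0.85, 60.0)]
states = [environment_state(load, latency) for load, latency in samples]

# A discrete state change is recorded only when a threshold is crossed.
transitions = [cur for prev, cur in zip(states, states[1:]) if cur != prev]
```

In this example the second sample differs from the first in its raw values but maps to the same discrete state, so only two state changes are recorded across the four samples.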

[0066] The consumer feedback reward may contain at least one or a combination of: a reward feedback value that can be a real value, e.g., an indication of QoS achieved (including the distance from the expected QoS), or a binary value, i.e., prediction success or failure, or a quantified value to show the degree of satisfaction when the prediction results provided were used; a time stamp to indicate the time when the reward feedback was issued; a validity period that indicates until when the reward feedback value shall be used; the usage description of the provided Event ID which may include at least one of a) the time schedule that indicates when the prediction results provided were used; and/or b) the time duration that indicates for how long the prediction results provided were used; and/or c) the use case context wherein the prediction results provided were used, e.g., for energy saving, load balancing, etc.; and/or d) the target object where the prediction results provided were used, which may include a UE ID or a group UE ID, a NF (including NF type), or a geographical area, e.g., Area of Interest, cells or tracking area; the confidence degree related to the reward feedback provided; the Vendor ID that indicates the vendor that issued the reward feedback; the Analytics ID that the feedback reward is related to or in case of a subscription reusing the Correlation ID, i.e., an identifier of an existing subscription that this feedback is related to or a new Feedback Correlation ID associated with the feedback of a subscription; information related to further reward feedback reports including number of scheduled future feedback reports, the time when the next feedback report is scheduled.
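
A subset of the reward feedback attributes above can be sketched as a record with a validity check, together with a real-valued reward expressed as the distance from the expected QoS. The record layout, field names and the signed-difference convention are assumptions for illustration and are not defined by the present disclosure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RewardFeedback:
    """Hypothetical reward feedback report covering a subset of the attributes."""
    analytics_id: str
    vendor_id: str
    value: float              # real value, or 1.0 / 0.0 for a binary outcome
    issued_at: datetime       # time stamp of the feedback
    valid_for: timedelta      # validity period of the feedback value
    confidence: float = 1.0   # confidence degree of the feedback

    def is_valid(self, now: datetime) -> bool:
        return self.issued_at <= now <= self.issued_at + self.valid_for


def qos_distance_reward(achieved: float, expected: float) -> float:
    """Signed distance from the expected QoS; a negative value is a shortfall."""
    return achieved - expected


issued = datetime(2024, 4, 5, 12, 0, tzinfo=timezone.utc)
feedback = RewardFeedback(
    analytics_id="policy-selection",
    vendor_id="vendor-a",
    value=qos_distance_reward(achieved=95.0, expected=100.0),
    issued_at=issued,
    valid_for=timedelta(minutes=10),
)
```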

[0067] The consumer reward feedback can be requested by the NWDAF or enhanced NWDAF or new NF that provides the recommendations by either issuing a new request or subscription related to requesting a feedback reward or by piggybacking the reward feedback request into the recommendations action/s. In either case the NF request for a consumer reward feedback report may contain at least one or a combination of: the Analytics ID(s) or recommendation ID that the reward feedback is needed for, or in case of a subscription the Correlation ID or a new Feedback Correlation ID; the target period, i.e., the time interval when the feedback reward report is needed; the time or time schedule (in case of a subscription) for providing a feedback reward report; the desired accuracy of the feedback reward; a Notification Target Address to provide the feedback reward report.

[0068] Reward feedback from a single consumer in a multi-vendor environment, i.e., wherein the predictive NF data producer and the NF consumer are developed by different vendors, may be perceived as being biased or unfair, especially when the feedback is negative. This introduces a major challenge. To overcome this situation, a common approach is to allow the RL algorithm in the RL agent to consider and note at least one or a combination of: the vendor identifier of the reward feedback, enabling the RL agent to accept or reject rewards from certain vendors and/or mark and compare the rewards from different vendors to create knowledge on how different vendors perceive the recommendations considering but not limited to: (i) the usage (i.e., use case), (ii) time frame or time schedule of using the recommendation, (iii) geographic area wherein the recommendations were applied, (iv) the service or application or slice identifier wherein the recommendations were applied, (v) the mobile user type (human, drone, car, sensor) or considering the mobility profile (static, high/medium mobility) where the recommendations were applied; and multiple consumer reward feedback before taking any corrective policy actions or updates, considering the time frame for collecting reward feedback or the number of reports or a specified condition, e.g., a threshold or limit. Multiple consumer reward feedback can then be combined, e.g., aggregated, averaged, weighted, or even combined by employing an AI/ML logic. The logic of combining multiple feedback rewards and relating it with the observation of the environment state is internal to the RL agent (i.e., within the analytics network function). However, the RL agent may keep track of the consumer characteristics in relation to the observation environment state to enhance its learning capabilities.
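
One possible way of combining multiple consumer reward feedback reports is sketched below: reports from excluded vendors are dropped, a minimum number of reports is required before any value is produced, and the remaining values are averaged using their confidence degrees as weights. Since the combining logic is internal to the RL agent, this is only one illustrative choice; the function names, the tuple layout and the default of three reports are assumptions.

```python
from collections import defaultdict


def aggregate_rewards(reports, blocked_vendors=frozenset(), min_reports=3):
    """Confidence-weighted mean of accepted reports, or None if too few.

    reports: iterable of (vendor_id, reward_value, confidence) tuples.
    """
    accepted = [r for r in reports if r[0] not in blocked_vendors]
    if len(accepted) < min_reports:
        return None  # wait for more feedback before any corrective update
    total_weight = sum(confidence for _, _, confidence in accepted)
    if total_weight == 0:
        return None
    return sum(value * confidence for _, value, confidence in accepted) / total_weight


def per_vendor_mean(reports):
    """Mean reward per vendor, to compare how vendors perceive recommendations."""
    totals = defaultdict(lambda: [0.0, 0])
    for vendor, value, _ in reports:
        totals[vendor][0] += value
        totals[vendor][1] += 1
    return {vendor: total / count for vendor, (total, count) in totals.items()}


reports = [
    ("vendor-a", 1.0, 1.0),
    ("vendor-b", 0.0, 1.0),
    ("vendor-a", 1.0, 2.0),
    ("vendor-c", -1.0, 1.0),
]
combined = aggregate_rewards(reports, blocked_vendors={"vendor-c"})
by_vendor = per_vendor_mean(reports)
```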

[0069] Figure 4 illustrates an example 400 of a recommendation request based on the RL paradigm in accordance with aspects of the present disclosure.

[0070] The example 400 shows a recommendation consumer/consumer entity 410 as a PCF that is responsible for selecting a policy that would suit a requested SLA. It shall be noted that an SLA can be realized in various ways, i.e., by combining the parameters of throughput, latency, etc., in different ways, with the role of recommendations being to provide different realizations for the PCF to choose from.

[0071] The example 400 also shows a NRF 420, an analytics network function 430 as an NWDAF (although alternatively the function 430 may be an enhanced NWDAF or other NF having a capability for prescriptive analytics), and a 5G core system 440. The 5G system 440 may include other NWDAFs, OAM, etc., for providing information on the network and/or a service state.

[0072] In a first step 401, the consumer function 410 (i.e., the Recommendation Consumer which is a PCF) has received a request from an application or service for policy provision with a specified SLA and requires recommendations to select or renegotiate a policy.

[0073] In a further step 402, the consumer function 410 (PCF) needs to discover from the NRF 420 the appropriate NWDAF 430 (or enhanced NWDAF or NF) that can provide the desired recommendations based on the NWDAF 430 (or enhanced NWDAF or NF) recommendation capabilities registered in the NRF 420. This is shown as “discover process (recommendation capabilities)”.
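
The discovery in step 402 amounts to matching the consumer's needs against the recommendation capabilities registered in the NRF. The sketch below uses an in-memory dictionary as a stand-in for the NRF and does not reproduce any standardized discovery service; the registry layout, keys and identifiers are hypothetical.

```python
def discover(registry, analytics_id, consumer_type, s_nssai=None):
    """Return identifiers of NFs whose registered capabilities match the query."""
    matches = []
    for nf_id, profile in registry.items():
        if analytics_id not in profile["analytics_ids"]:
            continue  # this NF does not provide the requested recommendations
        if consumer_type not in profile["eligible_consumers"]:
            continue  # this consumer type may not use the RL model
        if s_nssai is not None and profile["filters"].get("s_nssai") != s_nssai:
            continue  # registered for a different slice
        matches.append(nf_id)
    return sorted(matches)


# Hypothetical NRF content after two analytics functions have registered.
registry = {
    "nwdaf-1": {
        "analytics_ids": ["policy-selection"],
        "eligible_consumers": ["PCF"],
        "filters": {"s_nssai": "01-000001"},
    },
    "nwdaf-2": {
        "analytics_ids": ["upf-selection"],
        "eligible_consumers": ["SMF"],
        "filters": {"s_nssai": "01-000001"},
    },
}
candidates = discover(registry, "policy-selection", "PCF", s_nssai="01-000001")
```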

[0074] In a further step 403, once the consumer function 410 (PCF) selects the appropriate NWDAF 430 (or enhanced NWDAF or NF) that can provide the desired recommendations, the consumer function 410 issues a request or subscription (to analytics network function 430) to collect the desired recommendations for a requested policy provision (i.e., from a PCF consumer), based on the PCF consumer SLA. The consumer function 410 (PCF) may request recommendation actions by including any of the attributes described in the present disclosure in the issued request or subscription. This is shown as “Request or Subscribe (recommendation/analytics ID)”.

[0075] In a further step 404, once the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) receives a request or subscription from the consumer function 410, and before it determines the desired recommendations, it needs to request or subscribe to the 5G core system 440 to receive the environment state. The notion of environment relates to the goal or SLA of the policy recommendations and can be obtained either by issuing a request or subscription for: respective analytic services, i.e., Analytics ID, related to network conditions, e.g., congestion or load in an area of interest or for specified NFs, or service experience related to an application; respective Events IDs or performance measurements and KPIs related to the fulfilment of the given SLA in relation with the policy provision or configuration management states provided by the OAM. This is shown as “Request or subscribe (Network/service state)”.

[0076] It shall be noted that the time schedule for requesting and subscribing for an environment state may vary depending on the time schedule of the request.

[0077] In a further step 405, the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) provides the recommendation actions to the consumer function 410 (PCF) including also the required metadata attributes. Two different options denoted ‘Option 1’ and ‘Option 2’ may then be executed. The two options will be separately described.

[0078] In option 1, step 405a is followed and the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) provides a request for reward feedback piggybacked in the recommendation actions response or notification. This is shown as “Respond or Notify (Recommendation). Include Request or Subscription to reward feedback”.

[0079] In option 2, step 405b is followed and the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) provides the recommendation actions response or notification. This is shown as “Respond or notify (Recommendation)”. A further step 405c is also followed, wherein the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) requests or subscribes to the consumer function 410 (PCF) for receiving reward feedback. This is shown as “Request or subscription to reward feedback (Recommendation)”.

[0080] It shall be noted that the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) will request reward feedback considering the vendor identity (i.e., excluding/including certain vendors).

[0081] In further step 406, the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) receives a response or notification from the 5G system 440 of the network or service state by receiving the respective analytic services report and/or the respective Events IDs or performance measurements and KPIs, or configuration management data or a combination thereof depending on the SLA and goals of the requested policy. This is shown as “Respond or Notify (Network/service state)”.

[0082] In further step 407, the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) receives the requested reward feedback response or notification from consumer function 410. This is shown as “Reward Feedback Response or Notify”.

[0083] It shall be noted that the analytics network function (NWDAF) 430 (or enhanced NWDAF or NF) will process the received reward feedback based on the logic adopted, i.e., considering a certain time window and schedule or a predetermined number of received reward feedback reports or an imposed performance threshold or limit, e.g., network load.
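
The conditions under which received reward feedback is processed (a time window, a number of reports, or a performance threshold) can be sketched as a simple trigger. The function name, the default values and the interpretation of the load threshold as an immediate trigger are assumptions made for the example.

```python
def should_process(report_times, now, window_s=300.0, max_reports=10,
                   load=None, load_limit=None):
    """Decide whether buffered reward feedback should be processed now.

    report_times: arrival times (in seconds) of buffered reports, oldest first.
    """
    if not report_times:
        return False
    if len(report_times) >= max_reports:
        return True   # predetermined number of reports received
    if now - report_times[0] >= window_s:
        return True   # collection time window has elapsed
    if load is not None and load_limit is not None and load >= load_limit:
        return True   # imposed performance threshold reached
    return False


early = should_process([0.0, 10.0], now=20.0)
window_elapsed = should_process([0.0, 10.0], now=400.0)
count_reached = should_process(list(range(10)), now=10.0)
load_reached = should_process([0.0], now=1.0, load=0.9, load_limit=0.8)
```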

[0084] Figure 5 illustrates an example of a UE 500 in accordance with aspects of the present disclosure. The UE 500 may include a processor 502, a memory 504, a controller 506, and a transceiver 508. The processor 502, the memory 504, the controller 506, or the transceiver 508, or various combinations thereof or various components thereof may be examples of means for performing various aspects of the present disclosure as described herein. These components may be coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces.

[0085] The processor 502, the memory 504, the controller 506, or the transceiver 508, or various combinations or components thereof may be implemented in hardware (e.g., circuitry). The hardware may include a processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or other programmable logic device, or any combination thereof configured as or otherwise supporting a means for performing the functions described in the present disclosure.

[0086] The processor 502 may include an intelligent hardware device (e.g., a general-purpose processor, a DSP, a CPU, an ASIC, an FPGA, or any combination thereof). In some implementations, the processor 502 may be configured to operate the memory 504. In some other implementations, the memory 504 may be integrated into the processor 502. The processor 502 may be configured to execute computer-readable instructions stored in the memory 504 to cause the UE 500 to perform various functions of the present disclosure.

[0087] The memory 504 may include volatile or non-volatile memory. The memory 504 may store computer-readable, computer-executable code including instructions that, when executed by the processor 502, cause the UE 500 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as the memory 504 or another type of memory. Computer-readable media includes both non-transitory computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A non-transitory storage medium may be any available medium that may be accessed by a general-purpose or special-purpose computer.

[0088] In some implementations, the processor 502 and the memory 504 coupled with the processor 502 may be configured to cause the UE 500 to perform one or more of the functions described herein (e.g., executing, by the processor 502, instructions stored in the memory 504). For example, the processor 502 may support wireless communication at the UE 500 in accordance with examples as disclosed herein. The UE 500 may be configured to support a means for performing aspects of the disclosure herein.

[0089] The controller 506 may manage input and output signals for the UE 500. The controller 506 may also manage peripherals not integrated into the UE 500. In some implementations, the controller 506 may utilize an operating system such as iOS®, ANDROID®, WINDOWS®, or other operating systems. In some implementations, the controller 506 may be implemented as part of the processor 502.

[0090] In some implementations, the UE 500 may include at least one transceiver 508. In some other implementations, the UE 500 may have more than one transceiver 508. The transceiver 508 may represent a wireless transceiver. The transceiver 508 may include one or more receiver chains 510, one or more transmitter chains 512, or a combination thereof.

[0091] A receiver chain 510 may be configured to receive signals (e.g., control information, data, packets) over a wireless medium. For example, the receiver chain 510 may include one or more antennas for receiving the signal over the air or wireless medium. The receiver chain 510 may include at least one amplifier (e.g., a low-noise amplifier (LNA)) configured to amplify the received signal. The receiver chain 510 may include at least one demodulator configured to demodulate the received signal and obtain the transmitted data by reversing the modulation technique applied during transmission of the signal. The receiver chain 510 may include at least one decoder for decoding and processing the demodulated signal to receive the transmitted data.

[0092] A transmitter chain 512 may be configured to generate and transmit signals (e.g., control information, data, packets). The transmitter chain 512 may include at least one modulator for modulating data onto a carrier signal, preparing the signal for transmission over a wireless medium. The at least one modulator may be configured to support one or more techniques such as amplitude modulation (AM), frequency modulation (FM), or digital modulation schemes like phase-shift keying (PSK) or quadrature amplitude modulation (QAM). The transmitter chain 512 may also include at least one power amplifier configured to amplify the modulated signal to an appropriate power level suitable for transmission over the wireless medium. The transmitter chain 512 may also include one or more antennas for transmitting the amplified signal into the air or wireless medium.

[0093] Figure 6 illustrates an example of a processor 600 in accordance with aspects of the present disclosure. The processor 600 may be an example of a processor configured to perform various operations in accordance with examples as described herein. The processor 600 may include a controller 602 configured to perform various operations in accordance with examples as described herein. The processor 600 may optionally include at least one memory 604, which may be, for example, an L1/L2/L3 cache. Additionally, or alternatively, the processor 600 may optionally include one or more arithmetic-logic units (ALUs) 606. One or more of these components may be in electronic communication or otherwise coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces (e.g., buses).

[0094] The processor 600 may be a processor chipset and include a protocol stack (e.g., a software stack) executed by the processor chipset to perform various operations (e.g., receiving, obtaining, retrieving, transmitting, outputting, forwarding, storing, determining, identifying, accessing, writing, reading) in accordance with examples as described herein. The processor chipset may include one or more cores, one or more caches (e.g., memory local to or included in the processor chipset (e.g., the processor 600)), or other memory (e.g., random access memory (RAM), read-only memory (ROM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), static RAM (SRAM), ferroelectric RAM (FeRAM), magnetic RAM (MRAM), resistive RAM (RRAM), flash memory, phase change memory (PCM), and others).

[0095] The controller 602 may be configured to manage and coordinate various operations (e.g., signalling, receiving, obtaining, retrieving, transmitting, outputting, forwarding, storing, determining, identifying, accessing, writing, reading) of the processor 600 to cause the processor 600 to support various operations in accordance with examples as described herein. For example, the controller 602 may operate as a control unit of the processor 600, generating control signals that manage the operation of various components of the processor 600. These control signals include enabling or disabling functional units, selecting data paths, initiating memory access, and coordinating timing of operations.

[0096] The controller 602 may be configured to fetch (e.g., obtain, retrieve, receive) instructions from the memory 604 and determine subsequent instruction(s) to be executed to cause the processor 600 to support various operations in accordance with examples as described herein. The controller 602 may be configured to track memory address of instructions associated with the memory 604. The controller 602 may be configured to decode instructions to determine the operation to be performed and the operands involved. For example, the controller 602 may be configured to interpret the instruction and determine control signals to be output to other components of the processor 600 to cause the processor 600 to support various operations in accordance with examples as described herein. Additionally, or alternatively, the controller 602 may be configured to manage flow of data within the processor 600. The controller 602 may be configured to control transfer of data between registers, arithmetic logic units (ALUs), and other functional units of the processor 600.

[0097] The memory 604 may include one or more caches (e.g., memory local to or included in the processor 600) or other memory, such as RAM, ROM, DRAM, SDRAM, SRAM, MRAM, flash memory, etc. In some implementations, the memory 604 may reside within or on a processor chipset (e.g., local to the processor 600). In some other implementations, the memory 604 may reside external to the processor chipset (e.g., remote to the processor 600).

[0098] The memory 604 may store computer-readable, computer-executable code including instructions that, when executed by the processor 600, cause the processor 600 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as system memory or another type of memory. The controller 602 and/or the processor 600 may be configured to execute computer-readable instructions stored in the memory 604 to cause the processor 600 to perform various functions. For example, the processor 600 and/or the controller 602 may be coupled with or to the memory 604, and the processor 600, the controller 602, and the memory 604 may be configured to perform various functions described herein. In some examples, the processor 600 may include multiple processors and the memory 604 may include multiple memories. One or more of the multiple processors may be coupled with one or more of the multiple memories, which may, individually or collectively, be configured to perform various functions herein.

[0099] The one or more ALUs 606 may be configured to support various operations in accordance with examples as described herein. In some implementations, the one or more ALUs 606 may reside within or on a processor chipset (e.g., the processor 600). In some other implementations, the one or more ALUs 606 may reside external to the processor chipset (e.g., the processor 600). One or more ALUs 606 may perform one or more computations such as addition, subtraction, multiplication, and division on data. For example, one or more ALUs 606 may receive input operands and an operation code, which determines an operation to be executed. One or more ALUs 606 may be configured with a variety of logical and arithmetic circuits, including adders, subtractors, shifters, and logic gates, to process and manipulate the data according to the operation. Additionally, or alternatively, the one or more ALUs 606 may support logical operations such as AND, OR, exclusive-OR (XOR), not-OR (NOR), and not-AND (NAND), enabling the one or more ALUs 606 to handle conditional operations, comparisons, and bitwise operations.

[0100] The processor 600 may support wireless communication in accordance with examples as disclosed herein. The processor 600 may be configured to or operable to support a means for performing aspects of the disclosure herein, such as the method 1000 of Figure 10 or the method 1100 of Figure 11.

[0101] Figure 7 illustrates an example of a NE 700 in accordance with aspects of the present disclosure. The NE 700 may include a processor 702, a memory 704, a controller 706, and a transceiver 708. The processor 702, the memory 704, the controller 706, or the transceiver 708, or various combinations thereof or various components thereof may be examples of means for performing various aspects of the present disclosure as described herein. These components may be coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces.

[0102] The processor 702, the memory 704, the controller 706, or the transceiver 708, or various combinations or components thereof may be implemented in hardware (e.g., circuitry). The hardware may include a processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or other programmable logic device, or any combination thereof configured as or otherwise supporting a means for performing the functions described in the present disclosure.

[0103] The processor 702 may include an intelligent hardware device (e.g., a general-purpose processor, a DSP, a CPU, an ASIC, an FPGA, or any combination thereof). In some implementations, the processor 702 may be configured to operate the memory 704. In some other implementations, the memory 704 may be integrated into the processor 702. The processor 702 may be configured to execute computer-readable instructions stored in the memory 704 to cause the NE 700 to perform various functions of the present disclosure.

[0104] The memory 704 may include volatile or non-volatile memory. The memory 704 may store computer-readable, computer-executable code including instructions that, when executed by the processor 702, cause the NE 700 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as the memory 704 or another type of memory. Computer-readable media includes both non-transitory computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A non-transitory storage medium may be any available medium that may be accessed by a general-purpose or special-purpose computer.

[0105] In some implementations, the processor 702 and the memory 704 coupled with the processor 702 may be configured to cause the NE 700 to perform one or more of the functions described herein (e.g., executing, by the processor 702, instructions stored in the memory 704). For example, the processor 702 may support wireless communication at the NE 700 in accordance with examples as disclosed herein. The NE 700 may be configured to support a means for performing aspects of the disclosure herein such as the method 800 of Figure 8 or the method 900 of Figure 9. The NE 700 may be a reinforcement learning agent 310 of Figure 3, a consumer function 410, a NRF 420 or analytics network function 430 of Figure 4, for instance.

[0106] The controller 706 may manage input and output signals for the NE 700. The controller 706 may also manage peripherals not integrated into the NE 700. In some implementations, the controller 706 may utilize an operating system such as iOS®, ANDROID®, WINDOWS®, or other operating systems. In some implementations, the controller 706 may be implemented as part of the processor 702.

[0107] In some implementations, the NE 700 may include at least one transceiver 708. In some other implementations, the NE 700 may have more than one transceiver 708. The transceiver 708 may represent a wireless transceiver. The transceiver 708 may include one or more receiver chains 710, one or more transmitter chains 712, or a combination thereof.

[0108] A receiver chain 710 may be configured to receive signals (e.g., control information, data, packets) over a wireless medium. For example, the receiver chain 710 may include one or more antennas for receiving the signal over the air or wireless medium. The receiver chain 710 may include at least one amplifier (e.g., a low-noise amplifier (LNA)) configured to amplify the received signal. The receiver chain 710 may include at least one demodulator configured to demodulate the received signal and obtain the transmitted data by reversing the modulation technique applied during transmission of the signal. The receiver chain 710 may include at least one decoder for decoding and processing the demodulated signal to receive the transmitted data.

[0109] A transmitter chain 712 may be configured to generate and transmit signals (e.g., control information, data, packets). The transmitter chain 712 may include at least one modulator for modulating data onto a carrier signal, preparing the signal for transmission over a wireless medium. The at least one modulator may be configured to support one or more techniques such as amplitude modulation (AM), frequency modulation (FM), or digital modulation schemes like phase-shift keying (PSK) or quadrature amplitude modulation (QAM). The transmitter chain 712 may also include at least one power amplifier configured to amplify the modulated signal to an appropriate power level suitable for transmission over the wireless medium. The transmitter chain 712 may also include one or more antennas for transmitting the amplified signal into the air or wireless medium.

[0110] Figure 8 illustrates a flowchart of a method 800 in accordance with aspects of the present disclosure. The operations of the method may be implemented by a NE as described herein. In some implementations, the NE may execute a set of instructions to control the function elements of the NE to perform the described functions.

[0111] At 802, the method 800 may include receiving, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network. The operations of 802 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 802 may be performed by a NE as described with reference to Figure 7.

[0112] At 804, the method 800 may include determining, using an analytics algorithm, one or more recommendation actions based on the first request. The operations of 804 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 804 may be performed by a NE as described with reference to Figure 7.

[0113] At 806, the method 800 may include transmitting, to the consumer function, a first response comprising the one or more recommendation actions. The operations of 806 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 806 may be performed by a NE as described with reference to Figure 7.

[0114] At 808, the method may include obtaining an environment state of the wireless communication network. The operations of 808 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 808 may be performed by a NE as described with reference to Figure 7.

[0115] At 810, the method may include determining, from the consumer function, a reward feedback information associated with the one or more recommendation actions. The operations of 810 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 810 may be performed by a NE as described with reference to Figure 7.

[0116] At 812, the method may include adjusting, based on the environment state and reward feedback information, the analytics algorithm. The operations of 812 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 812 may be performed by a NE as described with reference to Figure 7.

[0117] It should be noted that the method 800 described herein describes a possible implementation, and that the operations and the steps may be rearranged or otherwise modified and that other implementations are possible.
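The operations 802 to 812 may be read as one iteration of a reinforcement learning loop. The following Python sketch illustrates that reading under stated assumptions: a simple tabular value update stands in for the analytics algorithm, and the class name `AnalyticsFunction`, the function `run_iteration`, the action labels and the learning rate are hypothetical and are not taken from the disclosure. The sketch also uses a single state label as the context for both recommending and adjusting, which is a simplification.

```python
from collections import defaultdict


class AnalyticsFunction:
    """Hypothetical stand-in for the analytics network function of method 800."""

    def __init__(self, actions, learning_rate=0.5):
        self.actions = list(actions)
        self.learning_rate = learning_rate
        # Estimated value of each (environment state, action) pair; starts at 0.0.
        self.values = defaultdict(float)

    def recommend(self, state, top_k=1):
        """804: rank candidate actions for the given state, best first."""
        ranked = sorted(
            self.actions, key=lambda a: self.values[(state, a)], reverse=True
        )
        return ranked[:top_k]

    def adjust(self, state, action, reward):
        """812: move the stored estimate toward the observed reward."""
        key = (state, action)
        self.values[key] += self.learning_rate * (reward - self.values[key])


def run_iteration(nf, state, apply_and_score):
    """One pass through 802 to 812 for a single request.

    `apply_and_score` plays the role of the consumer function: it applies
    the recommended action and returns reward feedback (810).
    """
    recommended = nf.recommend(state)   # 804, sent to the consumer at 806
    action = recommended[0]
    reward = apply_and_score(action)    # 810: reward feedback
    nf.adjust(state, action, reward)    # 812, using the state obtained at 808
    return action, reward
```

In this sketch, an action that receives a negative reward is ranked lower the next time the same state is observed, so a different recommendation may be produced for a later request.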

[0118] Figure 9 illustrates a flowchart of a method 900 in accordance with aspects of the present disclosure. The operations of the method 900 may be implemented by a NE as described herein. In some implementations, the NE may execute a set of instructions to control the function elements of the NE to perform the described functions.

[0119] At 902, the method 900 may include transmitting, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network. The operations of 902 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 902 may be performed by a NE as described with reference to Figure 7.

[0120] At 904, the method 900 may include receiving, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm. The operations of 904 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 904 may be performed by a NE as described with reference to Figure 7.

[0121] At 906, the method 900 may include transmitting, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm. The operations of 906 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 906 may be performed by a NE as described with reference to Figure 7.

[0122] Figure 10 illustrates a flowchart of a method 1000 in accordance with aspects of the present disclosure. The operations of the method 1000 may be implemented by a processor as described herein. In some implementations, the processor may execute a set of instructions to control the function elements of an NE to perform the described functions.

[0123] At 1002, the method 1000 may include inputting a first request for recommendations or prescriptive analytics of a wireless communication network. The operations of 1002 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1002 may be performed by a processor as described with reference to Figure 6.

[0124] At 1004, the method 1000 may include obtaining, using an analytics algorithm, one or more recommendation actions based on the first request. The operations of 1004 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1004 may be performed by a processor as described with reference to Figure 6.

[0125] At 1006, the method 1000 may include outputting a first response comprising the one or more recommendation actions. The operations of 1006 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1006 may be performed by a processor as described with reference to Figure 6.

[0126] At 1008, the method 1000 may include inputting an environment state of the wireless communication network. The operations of 1008 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1008 may be performed by a processor as described with reference to Figure 6.

[0127] At 1010, the method 1000 may include inputting a reward feedback information associated with the one or more recommendation actions. The operations of 1010 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1010 may be performed by a processor as described with reference to Figure 6.

[0128] At 1012, the method 1000 may include adjusting, based on the environment state and reward feedback information, the analytics algorithm. The operations of 1012 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1012 may be performed by a processor as described with reference to Figure 6.

[0129] Figure 11 illustrates a flowchart of a method 1100 in accordance with aspects of the present disclosure. The operations of the method 1100 may be implemented by a processor as described herein. In some implementations, the processor may execute a set of instructions to control the function elements of an NE to perform the described functions.

[0130] At 1102, the method 1100 may include outputting a first request for recommendations or prescriptive analytics of a wireless communication network. The operations of 1102 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1102 may be performed by a processor as described with reference to Figure 6.

[0131] At 1104, the method 1100 may include inputting a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm. The operations of 1104 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1104 may be performed by a processor as described with reference to Figure 6.

[0132] At 1106, the method 1100 may include outputting a third response comprising reward feedback information for adjusting the analytics algorithm. The operations of 1106 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 1106 may be performed by a processor as described with reference to Figure 6.

[0133] The disclosure herein provides an analytics network function for wireless communication, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the analytics network function to: receive, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determine, using an analytics algorithm, one or more recommendation actions based on the first request; transmit, to the consumer function, a first response comprising the one or more recommendation actions; obtain an environment state of the wireless communication network; determine, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjust, based on the environment state and reward feedback information, the analytics algorithm.

[0134] Traditionally, network analytics provide an insight to other NFs that can use the insight to take a decision. In 3GPP Rel-19 it is anticipated that further enhancements, focusing on recommendations, will be introduced to support network data analytics function (NWDAF) assisted policy control and to address network abnormal behaviour.

[0135] The main idea is based on the observation that the NWDAF can gather data from various NFs (5GC NFs, AF, OAM) and hence can have a wide variety of knowledge. The NWDAF can thus be enhanced to assist a policy control function (PCF) in determining, for instance, quality of service (QoS) parameters that can achieve expected service experience requirements. Put differently, the NWDAF, instead of only providing an insight, can offer recommendations as a set of potential solutions, out of which a consumer can choose the most suitable solution.

[0136] Artificial intelligence (AI) / machine learning (ML) models employed in NWDAF currently adopt either supervised or unsupervised learning. These types of models provide an insight based on a model trained behaviour that relies on collected data. They react slowly to dynamic conditions and unforeseen situations, since new training is needed to adjust the model behaviour in the inference phase. RL, in contrast, relies on an agent that uses a goal directed strategy to provide recommendation actions. The RL agent adopts a policy that learns to perform a task through an RL algorithm, which interacts with the environment (via the means of an action, observation and feedback reward provision).

[0137] Current analytics proposals are introducing the capability of prescriptive analytics or recommendations in the 5G core network. However, these proposals lack a solution framework. More specifically, there are some solutions introduced that are based on reinforcement learning (RL), but these are generic. Such generic proposals lack a description of how recommendation reports shall be modelled, how the network environment and reward feedback shall be communicated and how a prescriptive analytics network function (NF) shall process this information.

[0138] Current analytics proposals do not describe the components of RL based prescriptive analytics in this detail.

[0139] The disclosure herein provides apparatuses and methods that introduce an RL mechanism in the 5G core network. Specifically, mechanisms are introduced into the 5G core network to handle RL and how to model: RL agent actions as recommendations; network environment observation as state; and the feedback reward as a measure of goal impact or success/improvement. Furthermore, mechanisms are introduced to process the reward feedback considering the vendor identity and other mechanisms that can combine feedback from various sources to assure a fair evaluation.

[0140] The analytics network function is able to offer recommendations or prescriptive analytics using RL in the 5G core by introducing a logical functionality related to prescriptive analytics that can either co-exist with the existing analytics services or can form a new network function dedicated to providing prescriptive analytics. More specifically, a parameter may be introduced to allow the discovery of a recommendation capability when offering other existing analytics services or a new service, as will be further discussed herein.

[0141] The analytics network function may be referred to as a ‘first’ network function. The consumer function may be referred to as a ‘second’ network function.

[0142] The first request may be in the form of a request or a subscription. The first request for recommendations or prescriptive analytics may be a request for one or more recommendations (for instance for a policy provision).

[0143] The analytics algorithm may be a prescriptive analytics algorithm that generates recommendation actions. The analytics algorithm may be a reinforcement learning analytics algorithm.

[0144] The environment state may be obtained by monitoring the wireless communication network directly and/or requesting the environment state from one or more network entities (or subscribing thereto). The environment state may characterize the state of the wireless communication network after executing a recommendation action.

[0145] The analytics algorithm provides recommendation actions based at least on the first request. The analytics network function then determines the environment state of the wireless communication network and feedback from the consumer function in order to adjust the analytics algorithm. Put differently, the first network function uses reinforcement learning to improve the analytics algorithm.

[0146] The first request for recommendations or prescriptive analytics may comprise at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

[0147] The filter information, time schedule information, time duration, use case context, level of accuracy and input data may be preferred filter information, preferred time schedule information, preferred time duration, preferred use case context, preferred level of accuracy and preferred input data.

[0148] The time schedule information may indicate when the recommendation action/s are required i.e., on a regular basis, at specific times, urgently, or delay tolerant.

[0149] The target of a recommendation may be objects for which recommendations are requested including entities such as specific user equipment (UE), a group of UEs or any/all UEs.

[0150] The required level of accuracy may be the level of accuracy of the recommendation actions including low, medium, high, highest.

[0151] The input data may comprise data samples, sample granularity, data sources and/or data statistics for producing recommendation actions.

[0152] The reporting threshold/s may be conditions on the level of each recommendation that, when reached, trigger a notification.
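As an illustration of the request contents listed above, the following Python sketch models the first request as a data structure in which every information element is optional. The class name `RecommendationRequest`, the field names and the `Accuracy` enumeration are hypothetical; the disclosure lists the information elements but does not define an encoding for them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Accuracy(Enum):
    """Accuracy levels named in the text; the enumeration itself is an assumption."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    HIGHEST = 4


@dataclass
class RecommendationRequest:
    """Hypothetical container for the first request; every field is optional."""
    analytics_id: Optional[str] = None
    preferred_algorithm_id: Optional[str] = None
    filter_info: dict = field(default_factory=dict)
    time_schedule: Optional[str] = None   # e.g. "regular", "urgent", "delay-tolerant"
    duration_s: Optional[int] = None
    use_case_context: Optional[str] = None
    target: Optional[str] = None          # a UE, a group of UEs, or "any"
    required_accuracy: Optional[Accuracy] = None
    input_data: dict = field(default_factory=dict)
    reporting_thresholds: dict = field(default_factory=dict)
    notification_address: Optional[str] = None

    def is_valid(self) -> bool:
        """Check that at least one of the listed elements is present."""
        return any(value not in (None, {}) for value in vars(self).values())
```

A consumer function such as a PCF might, for example, populate only the analytics identifier and the required level of accuracy and leave the remaining fields unset.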

[0153] The notification address may include a correlation ID.

[0154] The first response may further comprise at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to a location and one or more device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

[0155] The identifiers for the components may comprise component names which may be used for a given task such as KPIs, throughput, latency, SLA/QoS parameters. The identifier for the task, also referred to as a task ID, may be an umbrella ID such as a policy identifier.

[0156] The purpose of the recommendation action/s may include the calculation or selection of a policy or network entity, negotiation, or a feasibility check.

[0157] The filter information may include an area or location where the one or more recommendation actions are applicable. The filter information may include UE conditions such as UE mobility and communication patterns. The filter information may include a device type i.e., mobile phone, drone, vehicle, sensor device.

[0158] The expected reward feedback information may include timing, type, QoS achieved.

[0159] The use case context information may comprise, for example, energy-saving.

[0160] The at least one processor may be configured to cause the analytics network function to obtain the environment state by causing the analytics network function to: transmit, to the wireless communication network, a second request for the environment state; and receive, from the wireless communication network, a second response comprising the environment state.

[0161] The second request may be a subscription to one or more entities of the wireless communication network.

[0162] The environment state may comprise at least one of: an analytics in the form of one or more predictions and/or statistics related to the wireless communication network or a service thereof; an equipment configuration data provided by a network management function of the wireless communication network; a performance measurement related to a service level agreement of the consumer function or a configuration management information provided by the OAM.

[0163] The analytics may comprise a network performance parameter or plurality of performance parameters such as load or congestion level. The equipment configuration management data may comprise an energy-saving state of a network equipment, for instance, or a user mobility. The performance measurement may comprise a service experience including one or more parameters such as a throughput or QoS.

[0164] The reward feedback information comprises at least one of: a value for reward feedback; a timestamp indicating when the reward feedback information was issued; a validity period related to the reward feedback information; a usage description of the one or more recommendation actions; a confidence related to the reward feedback information; an identifier for a vendor related to the reward feedback information; an identifier for an analytics or a recommendation or a session related to the reward feedback information; and an information related to further reward feedback reporting.

[0165] The value for reward feedback may be a real value (i.e., QoS achieved) or a binary value (success/failure) or a quantified value showing a degree of satisfaction. The usage description may be for an event ID and may include a time schedule when used, a time duration, a use case context and/or a target object.
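Because the value for reward feedback may be binary or real-valued, an analytics network function would likely need to bring the different forms onto a common scale before using them. The following Python sketch shows one possible way of doing so, together with a validity-period check. The class name `RewardFeedback`, its field names and the scaling rule (dividing a real value by a target and clipping to the range 0 to 1) are assumptions made for illustration and are not specified by the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class RewardFeedback:
    """Hypothetical record for reward feedback information."""
    value: Union[bool, float]
    issued_at: float                     # timestamp, seconds since an arbitrary epoch
    validity_s: Optional[float] = None   # validity period; None means no expiry
    confidence: float = 1.0
    vendor_id: Optional[str] = None
    analytics_id: Optional[str] = None

    def is_valid_at(self, now: float) -> bool:
        """True while the feedback is inside its validity period."""
        if self.validity_s is None:
            return now >= self.issued_at
        return self.issued_at <= now <= self.issued_at + self.validity_s

    def as_scalar(self, target: Optional[float] = None) -> float:
        """Map binary or real-valued feedback onto the range [0, 1].

        Binary feedback maps to 0.0 or 1.0. A real value (for example an
        achieved QoS figure) is divided by `target` and clipped; this
        scaling rule is an assumption made for the sketch.
        """
        if isinstance(self.value, bool):
            return 1.0 if self.value else 0.0
        if target is None or target <= 0:
            raise ValueError("a positive target is needed to scale a real value")
        return max(0.0, min(1.0, self.value / target))
```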

[0166] The at least one processor may be configured to cause the analytics network function to determine the reward feedback information from the consumer function by causing the analytics network function to: determine the reward feedback information based at least partly on the identifier for the vendor.

[0167] The identifier for the vendor may be used to accept or reject reward feedback information from certain vendors and/or mark and compare the rewards from different vendors to create knowledge of how different vendors perceive the recommendation actions. For instance, the following information may be compared: the usage of recommendation actions; the time frame; the time schedule of use; the geographic area of use; the service or application or slice using the recommendation actions; a mobile user type and/or mobility profile.
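The two uses of the vendor identifier described above, accepting or rejecting feedback and comparing rewards across vendors, may be sketched as follows in Python. The function names, the dictionary keys `vendor_id` and `reward`, and the use of a per-vendor mean as the comparison are hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean


def filter_by_vendor(feedback, accepted_vendors):
    """Keep only feedback entries whose vendor identifier is accepted."""
    return [entry for entry in feedback if entry["vendor_id"] in accepted_vendors]


def mean_reward_per_vendor(feedback):
    """Group reward values by vendor so that vendors can be compared."""
    grouped = defaultdict(list)
    for entry in feedback:
        grouped[entry["vendor_id"]].append(entry["reward"])
    return {vendor: mean(values) for vendor, values in grouped.items()}
```

A finer comparison could group additionally by time frame, geographic area or slice, in line with the items listed above.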

[0168] The at least one processor may be configured to cause the analytics network function to determine the reward feedback information by causing the analytics network function to: transmit, to the consumer function, a third request for reward feedback information; and receive, from the consumer function, a third response comprising the reward feedback information.

[0169] The at least one processor may be configured to cause the analytics network entity to transmit the third request by causing the analytics network entity to: transmit the third request separate to the first response; or transmit the third request with the first response.

[0170] The third request may be ‘piggybacked’ with/on/in the first response. Alternatively, it may be provided separately. The third request may be a subscription.

[0171] The third request may comprise at least one of: a desired confidence or accuracy; the identifier for an analytics or a recommendation or a session related to the reward feedback information; a target period for reward feedback reporting; a time schedule for reward feedback reporting; and a notification address for reward feedback reporting.

[0172] The at least one processor may be configured to cause the analytics network function to: determine respective reward feedback information from each of a plurality of consumer functions; and adjust the analytics algorithm based on a combination of the respective reward feedback information.

[0173] Multiple reward feedback information from different consumer functions may be considered before adjusting the analytics algorithm (considering the timeframe for collecting reward feedback/reports or other conditions such as reporting thresholds).
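One possible way of combining reward feedback from a plurality of consumer functions is a confidence-weighted mean that is only produced once enough reports have been collected. The following Python sketch illustrates this; the function name `combine_rewards`, the `min_reports` condition and the weighting by confidence are assumptions, since the disclosure does not prescribe a particular combination rule.

```python
def combine_rewards(reports, min_reports=2):
    """Return a confidence-weighted mean reward, or None if too few reports.

    `reports` is a list of (reward, confidence) pairs, one per consumer function.
    """
    if len(reports) < min_reports:
        return None  # keep collecting before adjusting the algorithm
    total_weight = sum(confidence for _, confidence in reports)
    if total_weight == 0:
        return None  # no report carries any weight
    return sum(reward * confidence for reward, confidence in reports) / total_weight
```

The combined value could then be used as the single reward when the analytics algorithm is adjusted.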

[0174] The at least one processor may be configured to cause the analytics network function to: register, with a network repository function, one or more capabilities of the analytics network function for performing prescriptive analytics; or request a management system of the wireless communication system to configure a network repository function with one or more capabilities of the analytics network function for performing prescriptive analytics.

[0175] The registration may be performed directly between the analytics network function and NRF, or may alternatively be performed using the management system (such as an OAM function) of the wireless communication system.

[0176] The one or more capabilities may comprise at least one of: an identifier for an analytics or recommendation; an identifier for the analytics algorithm; an information identifying other network functions with permission to use the analytics network function for prescriptive analytics; an interoperability information; a use case context information; a filter information; an input data information; and an expected accuracy level of the model or analytics algorithm.
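Registration and discovery of prescriptive analytics capabilities may be illustrated with a small in-memory stand-in for a repository. The following Python sketch is not the NRF service interface; the class name `Repository`, its methods and the capability keys `analytics_ids`, `allowed_consumers` and `use_cases` are hypothetical and cover only a subset of the capabilities listed above.

```python
class Repository:
    """Hypothetical in-memory stand-in for a network repository function."""

    def __init__(self):
        self._profiles = {}

    def register(self, nf_id, capabilities):
        """Store the advertised prescriptive analytics capabilities of an NF."""
        self._profiles[nf_id] = dict(capabilities)

    def discover(self, consumer_id, analytics_id, use_case=None):
        """Return IDs of NFs that offer `analytics_id` to `consumer_id`."""
        matches = []
        for nf_id, caps in self._profiles.items():
            if analytics_id not in caps.get("analytics_ids", ()):
                continue
            # A missing permission list is treated as "open to all consumers".
            allowed = caps.get("allowed_consumers")
            if allowed is not None and consumer_id not in allowed:
                continue
            if use_case is not None and use_case not in caps.get("use_cases", ()):
                continue
            matches.append(nf_id)
        return matches
```

In this sketch a consumer function that is not on the permission list of a registered analytics network function does not discover it, which mirrors the permission information included among the capabilities.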

[0177] The analytics network function may comprise one of: a network data analytics function (NWDAF); a logical function; and another network function for providing prescriptive analytics.

[0178] The consumer function may comprise a policy control function (PCF).

[0179] The PCF may receive a request for policy provision with a specified SLA and may hence require recommendations to select or renegotiate a policy. The PCF may discover the analytics network function from a NRF and may select said NRF based on the SLA.

[0180] The disclosure herein further provides a consumer function for wireless communication comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the consumer function to: transmit, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receive, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmit, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.

[0181] The first request for recommendations or prescriptive analytics may comprise at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

[0182] The first response may further comprise at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to location and device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

[0183] The disclosure herein further provides a method performed by an analytics network function, comprising: receiving, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determining, using an analytics algorithm, one or more recommendation actions based on the first request; transmitting, to the consumer function, a first response comprising the one or more recommendation actions; obtaining an environment state of the wireless communication network; determining, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjusting, based on the environment state and reward feedback information, the analytics algorithm.

[0184] The first request for recommendations or prescriptive analytics may comprise at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

[0185] The first response may further comprise at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to location and device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

[0186] The obtaining the environment state may comprise transmitting, to the wireless communication network, a second request for the environment state; and receiving, from the wireless communication network, a second response comprising the environment state.

[0187] The environment state may comprise at least one of: an analytics in the form of one or more predictions and/or statistics related to the wireless communication network or a service thereof; an equipment configuration data provided by a network management function of the wireless communication network; and a performance measurement related to a service level agreement of the consumer function.

[0188] The reward feedback information may comprise at least one of: a value for reward feedback; a timestamp indicating when the reward feedback information was issued; a validity period related to the reward feedback information; a usage description of the one or more recommendation actions; a confidence related to the reward feedback information; an identifier for a vendor related to the reward feedback information; an identifier for an analytics or a recommendation or a session related to the reward feedback information; and an information related to further reward feedback reporting.
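For illustration, a reward feedback record carrying a timestamp and a validity period might be checked for staleness before it is used to adjust the analytics algorithm. This is a sketch under assumptions: the class RewardFeedback, its field names, and the rule that feedback without a validity period never expires are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RewardFeedback:
    """Hypothetical reward feedback record; times are seconds since an epoch."""
    value: float
    issued_at: float
    validity_s: Optional[float] = None   # None is treated as "no expiry" (assumed)
    confidence: Optional[float] = None
    vendor_id: Optional[str] = None
    recommendation_id: Optional[str] = None

    def is_valid_at(self, now: float) -> bool:
        """True if the feedback has been issued and has not yet expired."""
        if now < self.issued_at:
            return False
        if self.validity_s is None:
            return True
        return now - self.issued_at <= self.validity_s


fb = RewardFeedback(value=0.8, issued_at=100.0, validity_s=60.0, vendor_id="vendor-a")
```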

[0189] The determining the reward feedback information from the consumer function may comprise determining the reward feedback information based at least partly on the identifier for the vendor.

[0190] The determining the reward feedback information may comprise transmitting, to the consumer function, a third request for reward feedback information; and receiving, from the consumer function, a third response comprising the reward feedback information.

[0191] The method may comprise transmitting the third request separate to the first response; or transmitting the third request with the first response.

[0192] The third request may comprise at least one of: a desired confidence or accuracy; the identifier for an analytics or a recommendation or a session related to the reward feedback information; a target period for reward feedback reporting; a time schedule for reward feedback reporting; and a notification address for reward feedback reporting.
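The two transmission options for the third request can be sketched as follows. The message classes, field names and the build_messages helper are hypothetical and serve only to show the difference between sending the feedback request with the first response and sending it separately.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackRequest:
    """Hypothetical third request for reward feedback."""
    recommendation_id: str
    notification_address: str
    desired_confidence: Optional[float] = None
    target_period_s: Optional[int] = None
    time_schedule: Optional[str] = None


@dataclass
class RecommendationResponse:
    """Hypothetical first response; may carry a piggybacked feedback request."""
    recommendation_id: str
    actions: list
    feedback_request: Optional[FeedbackRequest] = None


def build_messages(recommendation_id, actions, notify, piggyback):
    """Return the messages to send, in order, for the chosen option."""
    request = FeedbackRequest(recommendation_id, notify)
    if piggyback:
        # Third request transmitted with the first response.
        return [RecommendationResponse(recommendation_id, actions, request)]
    # Third request transmitted separately from the first response.
    return [RecommendationResponse(recommendation_id, actions), request]


together = build_messages("rec-1", ["keep_policy"], "http://nf.example/notify", True)
separate = build_messages("rec-1", ["keep_policy"], "http://nf.example/notify", False)
```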

[0193] The method may comprise determining respective reward feedback information from each of a plurality of consumer functions; and adjusting the analytics algorithm based on a combination of the respective reward feedback information.
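The disclosure does not fix how the respective reward feedback information is combined. One possible combination, shown here purely as an assumption, is a confidence-weighted mean of the reported reward values; the function name combine_rewards and the tuple layout are hypothetical.

```python
def combine_rewards(reports):
    """Confidence-weighted mean of (value, confidence) pairs.

    This weighting rule is an illustrative assumption, not a rule taken
    from the disclosure.
    """
    total_weight = sum(confidence for _, confidence in reports)
    if total_weight <= 0:
        raise ValueError("at least one report must have positive confidence")
    weighted_sum = sum(value * confidence for value, confidence in reports)
    return weighted_sum / total_weight


# Two consumer functions report on the same recommendation.
combined = combine_rewards([(1.0, 0.75), (0.0, 0.25)])
```

The combined value can then be passed to the adjustment step in place of a single consumer's reward.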

[0194] The method may comprise registering, with a network repository function ‘NRF’, one or more capabilities of the analytics network function for performing prescriptive analytics. The method may comprise requesting a management system of the wireless communication system to configure a network repository function with one or more capabilities of the analytics network function for performing prescriptive analytics.

[0195] The one or more capabilities may comprise at least one of: an identifier for an analytics or recommendation; an identifier for the analytics algorithm; an information identifying other network functions with permission to use the analytics network function for prescriptive analytics; an interoperability information; a use case context information; a filter information; an input data information; and an expected accuracy level of the model or analytics algorithm.
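As an illustrative sketch only, registration and discovery of such capabilities can be modelled with an in-memory registry. The Repository class below is a stand-in and is not the NRF service interface; the profile fields, the discover parameters and the identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CapabilityProfile:
    """Hypothetical capability profile of an analytics network function."""
    nf_instance_id: str
    analytics_ids: set
    algorithm_id: str
    allowed_consumers: set     # NF types permitted to use the function
    expected_accuracy: float


class Repository:
    """In-memory stand-in for a repository function (not a real NRF API)."""

    def __init__(self):
        self._profiles = {}

    def register(self, profile: CapabilityProfile) -> None:
        self._profiles[profile.nf_instance_id] = profile

    def discover(self, analytics_id, consumer_type, min_accuracy=0.0):
        """Return IDs of registered functions matching the consumer's needs."""
        return sorted(
            p.nf_instance_id
            for p in self._profiles.values()
            if analytics_id in p.analytics_ids
            and consumer_type in p.allowed_consumers
            and p.expected_accuracy >= min_accuracy
        )


repo = Repository()
repo.register(CapabilityProfile("nwdaf-1", {"policy-rec"}, "rl-v1", {"PCF"}, 0.9))
repo.register(CapabilityProfile("nwdaf-2", {"policy-rec"}, "rl-v2", {"AF"}, 0.95))
matches = repo.discover("policy-rec", "PCF", min_accuracy=0.8)
```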

[0196] The analytics network function may comprise one of: a network data analytics function ‘NWDAF’; a logical function; and another network function for providing prescriptive analytics.

[0197] The consumer function may comprise a policy control function ‘PCF’.

[0198] The disclosure herein further provides a method performed by a consumer function, comprising: transmitting, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receiving, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmitting, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.

[0199] The first request for recommendations or prescriptive analytics may comprise at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

[0200] The first response may further comprise at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to location and device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

[0201] The disclosure herein further provides a processor for wireless communication, comprising: at least one controller coupled with at least one memory and configured to cause the processor to: input a first request for recommendations or prescriptive analytics of a wireless communication network; obtain, using an analytics algorithm, one or more recommendation actions based on the first request; output a first response comprising the one or more recommendation actions; input an environment state of the wireless communication network; input a reward feedback information associated with the one or more recommendation actions; and adjust, based on the environment state and reward feedback information, the analytics algorithm.

[0202] The disclosure herein further provides a method performed by a processor, comprising: inputting a first request for recommendations or prescriptive analytics of a wireless communication network; obtaining, using an analytics algorithm, one or more recommendation actions based on the first request; outputting a first response comprising the one or more recommendation actions; inputting an environment state of the wireless communication network; inputting a reward feedback information associated with the one or more recommendation actions; and adjusting, based on the environment state and reward feedback information, the analytics algorithm.

[0203] The disclosure herein further provides a processor for wireless communication, comprising: at least one controller coupled with at least one memory and configured to cause the processor to: output a first request for recommendations or prescriptive analytics of a wireless communication network; input a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and output a third response comprising reward feedback information for adjusting the analytics algorithm.

[0204] The disclosure herein further provides a method performed by a processor comprising: outputting a first request for recommendations or prescriptive analytics of a wireless communication network; inputting a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and outputting a third response comprising reward feedback information for adjusting the analytics algorithm.

[0205] Current analytics proposals are introducing the capability of prescriptive analytics or recommendations in the 5G core network. However, these proposals lack a solution framework to realize this. There are some solutions introduced based on RL but these are generic, i.e., they lack a description of how the recommendation actions/recommendation reports shall be modelled, how the network environment and reward feedback shall be communicated and how the prescriptive analytics NF shall process this information.

[0206] The disclosure herein relates to apparatuses (i.e., network functions) and methods introducing an RL mechanism in the 5G core network. It introduces, in the current 5G core mechanisms, how to handle RL and how to model: (i) the RL agent actions as recommendations, (ii) the network environment observation as state information and (iii) the feedback reward as a measure of goal impact or success/improvement. In addition, it introduces mechanisms to process the reward feedback considering the vendor identity and other mechanisms that can combine feedback from various sources to assure a fair evaluation.

[0207] Current proposals do not describe the components of RL-based prescriptive analytics in this level of detail. An exemplary use of such an apparatus and method is using prescriptive analytics for policy provision.

[0208] There is provided an apparatus and a method for assisting an analytics network function to offer prescriptive analytics in a 5G core network by introducing a logical functionality related to prescriptive analytics that can either co-exist with the existing analytics services or form a new network function dedicated to providing prescriptive analytics.

[0209] A parameter may be introduced to allow the discovery of a recommendation capability when offering other existing analytics services or a new service.

[0210] The analytics network function may register in a repository indicating its capabilities, receive a request for prescriptive analytics, transmit a response including recommendation actions, request consumer feedback information to adjust the recommendation production, and collect an environment state to adjust the recommendation production.

[0211] The analytics network function may register including any of the following information: the analytics service identifier or recommendation identifier, the model identifier, other network function information that may use it, interoperability information, the use case context, filter information, input data information, accuracy level information.

[0212] The analytics network function may issue a request that can include any of the following information: the analytics service identifier or recommendation identifier, the preferred model identifier, the preferred filter information, the preferred time schedule, the preferred time duration related to the recommendation, the preferred use case context, the target of recommendation, the preferred level of accuracy, the preferred input data, reporting thresholds, the notification address.

[0213] The network function may provide, to the consumer that requested prescriptive analytics or recommendations, a recommendation report that can include any of the following information: the recommendation data, the task identifier and the related component names associated with the recommendation data, the purpose that the recommendation data serves, the duration and time schedule related to the recommendation data, the timestamp related to the production of the recommendation data, filter information with respect to location and device capabilities, network context and/or use case context information related to the applicability of the recommendation data, an expected reward feedback information related to the capability of the consumer to provide reward feedback related to the recommendation data, a confidence degree related to the recommendation data.

[0214] The analytics network function may receive environment information in any of the following forms including: analytics output in the form of predictions or statistics related to the network or service, equipment configuration data provided by the network management, performance measurements related to the service level agreement (SLA) of the consumer (i.e., a policy consumer - such as a PCF consumer).

[0215] The analytics network function may receive reward feedback from the analytics consumer (i.e., the PCF) that may contain any of the following information including: the reward feedback value, a time stamp related to the reward feedback value, a validity period related to the reward feedback, a usage description of the prescriptive analytics or recommendation action, the confidence degree related to the reward feedback, the vendor identifier related to the consumer that issued the reward feedback, the analytics identifier or recommendation identifier or session identifier that the issued reward feedback is related to, information related to expected reward feedback reports.

[0216] The analytics network function may request to receive reward feedback from the analytics consumer (i.e., the PCF) based on the consumer vendor identifier.

[0217] The analytics network function may request to receive reward feedback from multiple analytics consumers (i.e., PCFs) and wait for a configured time interval before it processes them collaboratively based on a configured logic.
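A minimal sketch of such windowed collection is given below. It assumes that the time interval starts when the first report for a recommendation arrives and that the configured logic is a plain average; both choices, together with the class name FeedbackWindow and its methods, are hypothetical.

```python
from collections import defaultdict


class FeedbackWindow:
    """Buffers reward feedback per recommendation for a configured interval."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self._opened_at = {}                 # recommendation_id -> first arrival
        self._buffer = defaultdict(list)     # recommendation_id -> reports

    def add(self, recommendation_id, consumer_id, value, now):
        """Store one report; the window opens on the first report (assumed)."""
        self._opened_at.setdefault(recommendation_id, now)
        self._buffer[recommendation_id].append((consumer_id, value))

    def flush_if_due(self, recommendation_id, now):
        """Return the averaged reward once the interval has elapsed, else None."""
        opened = self._opened_at.get(recommendation_id)
        if opened is None or now - opened < self.window_s:
            return None
        reports = self._buffer.pop(recommendation_id)
        del self._opened_at[recommendation_id]
        # Configured logic (assumed here): plain average over all consumers.
        return sum(value for _, value in reports) / len(reports)


window = FeedbackWindow(window_s=10.0)
window.add("rec-1", "pcf-a", 1.0, now=0.0)
window.add("rec-1", "pcf-b", 0.5, now=4.0)
early = window.flush_if_due("rec-1", now=5.0)
combined = window.flush_if_due("rec-1", now=10.0)
```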

[0218] The analytics network function may issue a request for reward feedback from the analytics consumer (i.e., the PCF) that may contain any of the following information including: the analytics service identifier or recommendation identifier, the session identifier or feedback identifier, the target period related to the feedback reward report, the time schedule for providing a feedback reward report, the desired accuracy, the notification address.

[0219] The analytics network function may request to receive reward feedback from an analytics consumer (i.e., the PCF) either by piggybacking the reward feedback request on the recommendation report, or by using a separate reward feedback request or subscription.

[0220] It should be noted that the method described herein describes a possible implementation, and that the operations and the steps may be rearranged or otherwise modified and that other implementations are possible.

[0221] The description herein is provided to enable a person having ordinary skill in the art to make or use the disclosure. Various modifications to the disclosure will be apparent to a person having ordinary skill in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the broadest scope consistent with the principles and novel features disclosed herein.

[0222] The following abbreviations are relevant in the field addressed by this document: 3GPP, 3rd Generation Partnership Project; 5G, 5th Generation of Mobile Communication; AI/ML, Artificial Intelligence/Machine Learning; ADRF, Analytical Data Repository Function; AF, Application Function; AnLF, Analytics Logical Function; BSF, Binding Support Function; CM, Configuration Management; DCAF, Data Collection AF; DCCF, Data Collection Coordination Functionality; gNB, general Node B; KPI, Key Performance Indicator; MDA, Management Data Analytics; MF, Management Function; MFAF, Messaging Framework Adaptor Function; MnS, Management Service; MTLF, Model Training Logical Function; NEF, Network Exposure Function; NF, Network Function; NRF, Network Repository Function; NWDAF, Network Data Analytics Function; OAM, Operations, Administration and Maintenance; PCF, Policy Control Function; PM, Performance Measurement; QoS, Quality of Service; RL, Reinforcement Learning; SLA, Service Level Agreement; S-NSSAI, Single - Network Slice Selection Assistance Information; TA, Tracking Area; UDM, User Data Manager; UDR, User Data Repository; UPF, User Plane Function; and UE, User Equipment.

Claims

CLAIMS What is claimed is:

1. An analytics network function for wireless communication, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the analytics network function to: receive, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determine, using an analytics algorithm, one or more recommendation actions based on the first request; transmit, to the consumer function, a first response comprising the one or more recommendation actions; obtain an environment state of the wireless communication network; determine, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjust, based on the environment state and reward feedback information, the analytics algorithm.

2. The analytics network function of claim 1, wherein the first request for recommendations or prescriptive analytics comprises at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

3. The analytics network function of any preceding claim, wherein the first response further comprises at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to a location and one or more device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

4. The analytics network function of any preceding claim, wherein the at least one processor is configured to cause the analytics network function to obtain the environment state by causing the analytics network function to: transmit, to the wireless communication network, a second request for the environment state; and receive, from the wireless communication network, a second response comprising the environment state.

5. The analytics network function of any preceding claim, wherein the environment state comprises at least one of: an analytics in the form of one or more predictions and/or statistics related to the wireless communication network or a service thereof; an equipment configuration data provided by a network management function of the wireless communication network; a performance measurement related to a service level agreement of the consumer function.

6. The analytics network function of any preceding claim, wherein the reward feedback information comprises at least one of: a value for reward feedback; a timestamp indicating when the reward feedback information was issued; a validity period related to the reward feedback information; a usage description of the one or more recommendation actions; a confidence related to the reward feedback information; an identifier for a vendor related to the reward feedback information; an identifier for an analytics or a recommendation or a session related to the reward feedback information; and an information related to further reward feedback reporting.

7. The analytics network function of claim 6, wherein the at least one processor is configured to cause the analytics network function to determine the reward feedback information from the consumer function by causing the analytics network function to: determine the reward feedback information based at least partly on the identifier for the vendor.

8. The analytics network function of any one of claims 6-7, wherein the at least one processor is configured to cause the analytics network function to determine the reward feedback information by causing the analytics network function to: transmit, to the consumer function, a third request for reward feedback information; and receive, from the consumer function, a third response comprising the reward feedback information.

9. The analytics network function of claim 8, wherein the at least one processor is configured to cause the analytics network function to transmit the third request by causing the analytics network function to: transmit the third request separate to the first response; or transmit the third request with the first response.

10. The analytics network function of any one of claims 6-9, wherein the third request comprises at least one of: a desired confidence or accuracy; the identifier for an analytics or a recommendation or a session related to the reward feedback information; a target period for reward feedback reporting; a time schedule for reward feedback reporting; and a notification address for reward feedback reporting.

11. The analytics network function of any preceding claim, wherein the at least one processor is configured to cause the analytics network function to: determine respective reward feedback information from each of a plurality of consumer functions; and adjust the analytics algorithm based on a combination of the respective reward feedback information.

12. The analytics network function of any preceding claim, wherein the at least one processor is configured to cause the analytics network function to: register, with a network repository function, one or more capabilities of the analytics network function for performing prescriptive analytics; or request a management system of the wireless communication system to configure a network repository function with one or more capabilities of the analytics network function for performing prescriptive analytics.

13. The analytics network function of claim 12, wherein the one or more capabilities comprise at least one of: an identifier for an analytics or recommendation; an identifier for the analytics algorithm; an information identifying other network functions with permission to use the analytics network function for prescriptive analytics; an interoperability information; a use case context information; a filter information; an input data information; and an expected accuracy level of the model or analytics algorithm.

14. The analytics network function of any preceding claim, wherein the analytics network function comprises one of: a network data analytics function; a logical function; and another network function for providing prescriptive analytics.

15. The analytics network function of any preceding claim, wherein the consumer function comprises a policy control function.

16. A consumer function for wireless communication, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to cause the consumer function to: transmit, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receive, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmit, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.

17. The consumer function of claim 16, wherein the first request for recommendations or prescriptive analytics comprises at least one of: an identifier for an analytics or for a recommendation; an identifier for a preferred model or analytics algorithm; a filter information; a time schedule information; a time duration for a recommendation; a use case context; a target for a recommendation; a required level of accuracy; an input data; a reporting threshold; and a notification address.

18. The consumer function of any one of claims 16-17, wherein the first response further comprises at least one of: an identifier for a task associated with the one or more recommendation actions; an identifier for one or more components associated with the one or more recommendation actions; a purpose of the one or more recommendation actions; a duration and time schedule of the one or more recommendation actions; a timestamp of the production of the one or more recommendation actions; a filter information related to location and device capabilities; a network context information and/or use case context information related to the applicability of the one or more recommendation actions; an expected reward feedback information related to a capability of the consumer function to provide reward feedback associated with the one or more recommendation actions; and a confidence of the one or more recommendation actions.

19. A method performed by an analytics network function, comprising: receiving, from a consumer function, a first request for recommendations or prescriptive analytics of a wireless communication network; determining, using an analytics algorithm, one or more recommendation actions based on the first request; transmitting, to the consumer function, a first response comprising the one or more recommendation actions; obtaining an environment state of the wireless communication network; determining, from the consumer function, a reward feedback information associated with the one or more recommendation actions; and adjusting, based on the environment state and reward feedback information, the analytics algorithm.

20. A method performed by a consumer function, comprising: transmitting, to an analytics network function, a first request for recommendations or prescriptive analytics of a wireless communication network; receiving, from the analytics network function, a first response comprising one or more recommendation actions, the one or more recommendation actions having been determined using an analytics algorithm; and transmitting, to the analytics network function, a third response comprising reward feedback information for adjusting the analytics algorithm.